Load the packages.
library(tidyverse)
Load the CalEnviroScreen data from www.daseh.org/data/CalEnviroScreen_data.csv and subset it so that you only have data from Fresno, Merced, Placer, Sonoma, and Yolo counties.
ces <- read_csv("https://daseh.org/data/CalEnviroScreen_data.csv")
## Rows: 8035 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): CaliforniaCounty, ApproxLocation, CES4.0PercRange
## dbl (64): CensusTract, ZIP, Longitude, Latitude, CES4.0Score, CES4.0Percenti...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ces_sub <- ces %>% filter(CaliforniaCounty %in% c("Fresno", "Merced", "Placer", "Sonoma", "Yolo"))
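The %in% operator keeps every row whose county matches any name in the vector, which is more compact than chaining several == comparisons with |. A minimal sketch on a made-up data frame (toy, keep, and their values are hypothetical and not taken from the CalEnviroScreen data):

```r
library(dplyr)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  CaliforniaCounty = c("Fresno", "Alameda", "Yolo", "Kern"),
  value = c(1, 2, 3, 4)
)

keep <- c("Fresno", "Yolo")

# Filter with %in%: TRUE for rows whose county appears in `keep`
with_in <- toy %>% filter(CaliforniaCounty %in% keep)

# Equivalent filter written with == and |
with_or <- toy %>%
  filter(CaliforniaCounty == "Fresno" | CaliforniaCounty == "Yolo")

with_in
```

Both filters should keep the same two rows; the %in% form scales better as the list of counties grows.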
Use the ggplot2 package to make a plot of how diesel particulate concentration (DieselPM; y-axis) is associated with traffic density values (Traffic; x-axis). You can use a lines layer (+ geom_line()) or a points layer (+ geom_point()), or both!
Assign the plot to the variable my_plot. Type my_plot in the console to have it displayed.
DieselPM: Diesel PM emissions from on-road and non-road sources
Traffic: Traffic density in vehicle-kilometers per hour per road length, within 150 meters of the census tract boundary
# General format
ggplot(???, aes(x = ???, y = ???)) +
  ??? +
  ???
my_plot <-
  ggplot(ces_sub, aes(x = Traffic, y = DieselPM)) +
  geom_line() +
  geom_point()
my_plot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
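These warnings mean that one row could not be drawn, most likely because Traffic or DieselPM is missing in that row (this is an assumption; the output does not say which column is affected). A minimal sketch of how one might locate and drop such rows before plotting, using a made-up data frame (toy and toy_complete are hypothetical names):

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data with one missing Traffic value
toy <- data.frame(
  Traffic = c(300, 900, NA, 1500),
  DieselPM = c(0.1, 0.3, 0.2, 0.5)
)

# Show the rows that ggplot2 would drop
toy %>% filter(is.na(Traffic) | is.na(DieselPM))

# Keep only rows where both plotted variables are present
toy_complete <- toy %>% filter(!is.na(Traffic), !is.na(DieselPM))

# Plotting the complete rows should no longer trigger the warning
ggplot(toy_complete, aes(x = Traffic, y = DieselPM)) +
  geom_point()
```

Dropping the rows explicitly makes the data loss visible in the code instead of leaving it to a warning.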
“Update” your plot by adding a title and changing the x and y axis titles. (Hint: use the labs() function.)
my_plot <- my_plot +
  labs(
    x = "Traffic density index",
    y = "Diesel particulate matter",
    title = "Relationship between traffic density and diesel particulate matter"
  )
my_plot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Use the scale_x_continuous() function to plot the x axis with the following breaks: c(250, 750, 1250, 1750, 2250).
# General format
my_plot <- my_plot +
  scale_x_continuous(breaks = ???)
my_plot <- my_plot +
  scale_x_continuous(
    breaks = c(250, 750, 1250, 1750, 2250)
  )
my_plot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
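Because the requested breaks are evenly spaced, they could also be generated with seq() instead of being typed out. A minimal sketch, assuming a made-up data frame (toy, breaks_x, and p are hypothetical names used only for illustration):

```r
library(ggplot2)

# Evenly spaced breaks from 250 to 2250 in steps of 500
breaks_x <- seq(from = 250, to = 2250, by = 500)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  Traffic = c(200, 800, 1400, 2300),
  DieselPM = c(0.1, 0.2, 0.4, 0.6)
)

p <- ggplot(toy, aes(x = Traffic, y = DieselPM)) +
  geom_point() +
  scale_x_continuous(breaks = breaks_x)
p
```

Using seq() keeps the spacing consistent if the range or step is changed later.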
Observe several different versions of the plot by displaying my_plot while adding a different “theme” to it.
# General format
my_plot + theme_bw()
my_plot + theme_bw()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
my_plot + theme_classic()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
my_plot + theme_dark()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
my_plot + theme_gray()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
my_plot + theme_void()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Create a boxplot (with the geom_boxplot() function) using the ces_sub data, where CaliforniaCounty is plotted on the x axis and DrinkingWater is plotted on the y axis.
DrinkingWater: Drinking water contaminant index for selected contaminants. A higher value means drinking water contains a greater volume of contaminants.
ces_sub %>%
  ggplot(aes(x = CaliforniaCounty, y = DrinkingWater)) +
  geom_boxplot()
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_boxplot()`).
Let’s look at the plot of traffic density and diesel particulate matter again.
Use the ggplot2 package to make a plot of how diesel particulate concentration (DieselPM; y-axis) is associated with traffic density values (Traffic; x-axis), where each county (CaliforniaCounty) has a different color (hint: use color = CaliforniaCounty in mapping).
# General format
ggplot(???, aes(
  x = ???,
  y = ???,
  color = ???
)) +
  geom_line() +
  geom_point()
ggplot(ces_sub, aes(
  x = Traffic,
  y = DieselPM,
  color = CaliforniaCounty
)) +
  geom_line() +
  geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Redo the above plot by adding faceting (+ facet_wrap( ~ CaliforniaCounty, nrow = 3)) to have the data for each county in a separate plot panel.
Assign the new plot as an object called facet_plot.
Try adjusting the number of rows in facet_wrap() to see how this changes the plot.
facet_plot <- ggplot(ces_sub, aes(
  x = Traffic,
  y = DieselPM,
  color = CaliforniaCounty
)) +
  geom_line() +
  geom_point() +
  facet_wrap(~CaliforniaCounty, nrow = 3)
facet_plot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
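Besides nrow, facet_wrap() also accepts ncol and a scales argument, which can help when panels have very different ranges. A minimal sketch on a made-up data frame (toy and toy_facets are hypothetical names, and the values are invented for illustration):

```r
library(ggplot2)

# Hypothetical toy data: three counties, three observations each
toy <- data.frame(
  CaliforniaCounty = rep(c("Fresno", "Merced", "Yolo"), each = 3),
  Traffic = c(100, 500, 900, 200, 600, 1000, 150, 550, 950),
  DieselPM = c(0.1, 0.3, 0.5, 0.05, 0.1, 0.2, 0.2, 0.4, 0.9)
)

toy_facets <- ggplot(toy, aes(
  x = Traffic,
  y = DieselPM,
  color = CaliforniaCounty
)) +
  geom_line() +
  geom_point() +
  # One column of panels; each panel gets its own y-axis range
  facet_wrap(~CaliforniaCounty, ncol = 1, scales = "free_y")
toy_facets
```

Free scales make within-county patterns easier to see, at the cost of making panels harder to compare directly.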
Observe what happens when you remove either geom_line() OR geom_point() from one of your plots above.
# These elements are removed from the plot, like layers
Modify facet_plot to remove the legend (hint: use theme() and the legend.position argument) and change the axis titles to “Diesel particulate matter” for the y axis and “Traffic density” for the x axis.
facet_plot <- facet_plot +
  theme(legend.position = "none") +
  labs(
    y = "Diesel particulate matter",
    x = "Traffic density"
  )
facet_plot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Modify facet_plot one more time with a fun theme! Look into the ThemePark package, which has lots of fun themes, and try one out. Remember that you will need to install it using remotes::install_github("MatthewBJane/ThemePark") and then load the library.
# remotes::install_github("MatthewBJane/ThemePark")
library(ThemePark)
facet_plot + theme_grand_budapest()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).