First, make sure you install and load the esquisse package using install.packages and library:
install.packages("esquisse")
install.packages("ggplot2")
library(esquisse)
library(ggplot2)
library(tidyverse)
FALSE ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
FALSE ✔ dplyr 1.1.4 ✔ readr 2.1.5
FALSE ✔ forcats 1.0.0 ✔ stringr 1.5.1
FALSE ✔ lubridate 1.9.3 ✔ tibble 3.2.1
FALSE ✔ purrr 1.0.2 ✔ tidyr 1.3.1
FALSE ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
FALSE ✖ dplyr::filter() masks stats::filter()
FALSE ✖ dplyr::lag() masks stats::lag()
FALSE ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
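Running install.packages every time reinstalls packages you already have. As a minimal sketch (the helper name find_missing is made up for this example and is not part of any package), you could check which packages are missing first and install only those:

```r
# Return the packages from `pkgs` that are not installed yet.
# requireNamespace() returns FALSE when a package cannot be found.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The packages used in this lab
to_install <- find_missing(c("esquisse", "ggplot2", "tidyverse"))
to_install

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

You still need the library calls afterwards, since installing a package does not load it.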
Let’s look at the relationship between exposure to pollution and visits to the ER for asthma issues.
Try creating a plot in esquisse using the calenviroscreen data. This dataset has a lot of variables, so first run the code below to subset it so that you’re only working with these variables: CES4.0Percentile, Asthma, and ChildrenPercLess10. We will also categorize CES4.0Percentile into three categories (high, middle, and low) to make visualization a little easier!
CES4.0Percentile: a measure of how much pollution people in a census tract experience, relative to the other census tracts in California
Asthma: age-adjusted rate of emergency department visits for asthma
ChildrenPercLess10: estimates of the percent per census tract of children under 10 years old
ces <- read_csv(file = "https://daseh.org/data/CalEnviroScreen_data.csv")
## Rows: 8035 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): CaliforniaCounty, ApproxLocation, CES4.0PercRange
## dbl (64): CensusTract, ZIP, Longitude, Latitude, CES4.0Score, CES4.0Percenti...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ces_sub <- select(ces, c("CES4.0Percentile", "Asthma", "ChildrenPercLess10"))
ces_sub <- ces_sub %>%
  mutate(CES4.0Perc_cat =
           case_when(CES4.0Percentile > 75 ~ "High",
                     CES4.0Percentile <= 75 & CES4.0Percentile > 25 ~ "Middle",
                     CES4.0Percentile <= 25 ~ "Low"))
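To see how the cutoffs behave at the boundaries, here is a small sketch that applies the same case_when rules to a toy table. The values in toy are invented for illustration and are not taken from the CalEnviroScreen data.

```r
library(dplyr)

# Hypothetical percentiles chosen to sit on and around the cutoffs
toy <- tibble(CES4.0Percentile = c(10, 25, 25.1, 75, 75.1, NA))

# Same rules as above: >75 is High, >25 up to 75 is Middle, <=25 is Low
toy <- toy %>%
  mutate(CES4.0Perc_cat = case_when(
    CES4.0Percentile > 75 ~ "High",
    CES4.0Percentile <= 75 & CES4.0Percentile > 25 ~ "Middle",
    CES4.0Percentile <= 25 ~ "Low"
  ))

toy

# Tally the rows in each category (missing percentiles stay NA)
count(toy, CES4.0Perc_cat)
```

A value of exactly 25 lands in "Low" and exactly 75 lands in "Middle", and a missing percentile gets no category. Running count(ces_sub, CES4.0Perc_cat) on the real data is a quick way to confirm the groups look reasonable before plotting.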
Select the ChildrenPercLess10 variable to be plotted on the x-axis.
Select the Asthma variable to be plotted on the y-axis.
What happens when you move CES4.0Perc_cat to the facet region of the esquisse GUI?
# esquisser(ces_sub)
ggplot(ces_sub) +
  aes(x = ChildrenPercLess10, y = Asthma) +
  geom_point(shape = "circle", size = 1.5, colour = "#112446") +
  theme_minimal() +
  facet_wrap(vars(CES4.0Perc_cat))
## Warning: Removed 23 rows containing missing values or values outside the scale range
## (`geom_point()`).
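The warning means some rows have a missing x or y value, so ggplot2 leaves them out of the plot. If you would rather handle this yourself, one option is to count and drop those rows before plotting. This is a sketch on an invented toy table, not the lab's data:

```r
library(dplyr)
library(tidyr)

# Hypothetical data with one missing x value and one missing y value
toy <- tibble(
  ChildrenPercLess10 = c(12.1, NA, 9.8, 15.3),
  Asthma = c(45.2, 60.1, NA, 30.7)
)

# Count rows where either plotted column is missing
toy %>%
  filter(is.na(ChildrenPercLess10) | is.na(Asthma)) %>%
  nrow()

# Keep only rows where both plotted columns are present
toy_complete <- toy %>%
  drop_na(ChildrenPercLess10, Asthma)

toy_complete
```

The same drop_na call on ces_sub would give a dataset that should plot without the missing-values warning.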
ggplot(ces_sub) +
  aes(x = ChildrenPercLess10, y = Asthma, colour = CES4.0Perc_cat) +
  geom_point(shape = "circle", size = 1.5) +
  scale_color_hue(direction = 1) +
  theme_minimal()
## Warning: Removed 23 rows containing missing values or values outside the scale range
## (`geom_point()`).
Click where it says “point” (may say “auto” depending on how you did the last question) on the far left side and change the plot to a different type of plot. Copy and paste the code into the chunk below. Close Esquisse and run the chunk below to generate a ggplot.
ggplot(ces_sub) +
  aes(x = ChildrenPercLess10, y = Asthma, colour = CES4.0Perc_cat) +
  geom_line(linewidth = 0.5) +
  scale_color_hue(direction = 1) +
  theme_minimal()
## Warning: Removed 23 rows containing missing values or values outside the scale range
## (`geom_line()`).
Launch Esquisse on any selection of the following datasets we have worked with before and explore!
co2 <- read_csv("https://daseh.org/data/Yearly_CO2_Emissions_1000_tonnes.csv")
## Rows: 192 Columns: 265
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): country
## dbl (264): 1751, 1752, 1753, 1754, 1755, 1756, 1757, 1758, 1759, 1760, 1761,...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cc <- read_csv("https://daseh.org/data/Yearly_CC_Disasters.csv")
## Rows: 970 Columns: 53
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (9): Country, ISO2, ISO3, Indicator, Unit, Source, CTS Code, CTS Name, ...
## dbl (44): ObjectId, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 19...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
nitrate <- read_csv(file = "https://daseh.org/data/Nitrate_Exposure_for_WA_Public_Water_Systems_byquarter_data.csv")
## Rows: 88 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): quarter
## dbl (10): year, pop_on_sampled_PWS, pop_0-3ug/L, pop_>3-5ug/L, pop_>5-10ug/L...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# esquisser(nitrate)
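The co2 and cc datasets store each year in its own column, which makes it hard to put year on an axis in esquisse. One possible approach is to reshape to long format first. The sketch below uses a tiny invented table (co2_toy) as a stand-in for co2, and the column names year and emissions are chosen for this example:

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for co2: one row per country, one column per year
co2_toy <- tibble(
  country = c("A", "B"),
  `2000` = c(10, 20),
  `2001` = c(11, 21)
)

# Gather the year columns into a single year column and a value column
co2_long <- co2_toy %>%
  pivot_longer(cols = -country, names_to = "year", values_to = "emissions") %>%
  mutate(year = as.numeric(year))

co2_long

# esquisser(co2_long)
```

With the data in this shape, year can go on the x-axis and emissions on the y-axis.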