First, install and load the esquisse package (along with ggplot2 and the tidyverse) using install.packages() and library():

install.packages("esquisse")
install.packages("ggplot2")
library(esquisse)
library(ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.3     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
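
Running install.packages() every time re-downloads packages you may already have. As an optional sketch, the helper below (missing_packages is a hypothetical name, not part of any package) reports which of the packages used in this lab are not yet installed, so you can install only those:

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("esquisse", "ggplot2", "tidyverse")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
}
```

The install line is left commented out so that running the chunk never installs anything without you choosing to.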

1.1

Let’s look at the relationship between exposure to pollution and visits to the ER for asthma issues.

Try creating a plot in esquisse using the CalEnviroScreen data. This dataset has many variables, so first run the code below to subset it to just three of them: CES4.0Percentile, Asthma, and ChildrenPercLess10. The code also groups CES4.0Percentile into three categories (High, Middle, and Low) to make visualization a little easier!

CES4.0Percentile: a measure of how much pollution people in a census tract experience, relative to the other census tracts in California

Asthma: age-adjusted rate of emergency department visits for asthma

ChildrenPercLess10: estimated percent of each census tract's population that is children under 10 years old

ces <- read_csv(file = "https://daseh.org/data/CalEnviroScreen_data.csv")
## Rows: 8035 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (3): CaliforniaCounty, ApproxLocation, CES4.0PercRange
## dbl (64): CensusTract, ZIP, Longitude, Latitude, CES4.0Score, CES4.0Percenti...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ces_sub <- select(ces, c("CES4.0Percentile", "Asthma", "ChildrenPercLess10"))

ces_sub <- ces_sub %>%
  mutate(CES4.0Perc_cat =
    case_when(CES4.0Percentile > 75 ~ "High",
              CES4.0Percentile <= 75 & CES4.0Percentile > 25 ~ "Middle",
              CES4.0Percentile <= 25 ~ "Low"))
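
To see how the cut points above behave at the boundaries, here is a small sketch using base R's cut() as a cross-check. This is an alternative technique, not the case_when() approach used in this lab, and the percentile values are made up for illustration:

```r
# Toy percentile values (hypothetical), including both cut points and a missing value
percentile <- c(10, 25, 25.1, 75, 75.1, 100, NA)

# right = TRUE (the default) makes each interval closed on the right:
# (-Inf, 25] is "Low", (25, 75] is "Middle", (75, Inf] is "High"
perc_cat <- cut(percentile,
                breaks = c(-Inf, 25, 75, Inf),
                labels = c("Low", "Middle", "High"))

# Count how many values fall in each category, including missing ones
table(perc_cat, useNA = "ifany")
```

A value of exactly 25 lands in "Low" and exactly 75 lands in "Middle", matching the conditions in the case_when() call. A missing percentile stays missing in both approaches. Unlike case_when(), cut() returns a factor whose levels are ordered Low, Middle, High, which may give a more natural legend or facet order.
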
# esquisser(ces_sub)
ggplot(ces_sub) +
  aes(x = ChildrenPercLess10, y = Asthma) +
  geom_point(shape = "circle", size = 1.5, colour = "#112446") +
  theme_minimal() +
  facet_wrap(vars(CES4.0Perc_cat))
## Warning: Removed 23 rows containing missing values or values outside the scale range
## (`geom_point()`).

ggplot(ces_sub) +
  aes(x = ChildrenPercLess10, y = Asthma, colour = CES4.0Perc_cat) +
  geom_point(shape = "circle", size = 1.5) +
  scale_color_hue(direction = 1) +
  theme_minimal()
## Warning: Removed 23 rows containing missing values or values outside the scale range
## (`geom_point()`).
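
The warning above says some rows could not be drawn. A likely cause is missing values in the plotted columns, and one way to check is to count the rows where x or y is NA. The sketch below uses a made-up data frame (toy, with hypothetical values) rather than the real data:

```r
# Toy stand-in for ces_sub (hypothetical values, not the real data)
toy <- data.frame(
  ChildrenPercLess10 = c(12.1, NA, 9.4, 15.0, 11.2),
  Asthma             = c(45.3, 60.2, NA, 30.8, NA)
)

# A point needs both an x and a y value, so a row is dropped if either is NA
n_dropped <- sum(!complete.cases(toy[, c("ChildrenPercLess10", "Asthma")]))
n_dropped
```

Running the same count on ces_sub should give a number comparable to the one in the warning, assuming the dropped rows are due to missing values and not to values outside the axis limits.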

1.2

Click where it says “point” (it may say “auto”, depending on how you answered the last question) on the far left side and change the plot to a different type. Copy and paste the generated code into the chunk below. Close Esquisse and run the chunk to generate a ggplot.

ggplot(ces_sub) +
  aes(x = ChildrenPercLess10, y = Asthma, colour = CES4.0Perc_cat) +
  # linewidth replaces the size argument for lines (size was deprecated in ggplot2 3.4.0)
  geom_line(linewidth = 0.5) +
  scale_color_hue(direction = 1) +
  theme_minimal()
## Warning: Removed 23 rows containing missing values or values outside the scale range
## (`geom_line()`).

Practice on Your Own!

P.1

Launch Esquisse on any of the following datasets, which we have worked with before, and explore!

co2 <- read_csv("https://daseh.org/data/Yearly_CO2_Emissions_1000_tonnes.csv")
## Rows: 192 Columns: 265
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr   (1): country
## dbl (264): 1751, 1752, 1753, 1754, 1755, 1756, 1757, 1758, 1759, 1760, 1761,...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cc <- read_csv("https://daseh.org/data/Yearly_CC_Disasters.csv")
## Rows: 970 Columns: 53
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (9): Country, ISO2, ISO3, Indicator, Unit, Source, CTS Code, CTS Name, ...
## dbl (44): ObjectId, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 19...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
nitrate <- read_csv(file = "https://daseh.org/data/Nitrate_Exposure_for_WA_Public_Water_Systems_byquarter_data.csv")
## Rows: 88 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): quarter
## dbl (10): year, pop_on_sampled_PWS, pop_0-3ug/L, pop_>3-5ug/L, pop_>5-10ug/L...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# esquisser(nitrate)
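
Datasets like co2 store one column per year, which can be awkward to map to plot axes. As an optional sketch, the code below reshapes a tiny made-up table with the same layout into long format using tidyr's pivot_longer(). The values, the co2_toy name, and the emissions column name are all hypothetical:

```r
library(tidyr)

# Toy table shaped like co2: one row per country, one column per year (made-up values)
co2_toy <- data.frame(
  country = c("X", "Y"),
  `1751`  = c(1, 2),
  `1752`  = c(3, 4),
  check.names = FALSE  # keep the year column names as-is
)

# Gather the year columns into two columns: year and emissions
co2_long_toy <- pivot_longer(
  co2_toy,
  cols = -country,
  names_to = "year",
  values_to = "emissions",
  names_transform = list(year = as.integer)  # store years as integers, not text
)

co2_long_toy
# esquisser(co2_long_toy)
```

In the long version, year and emissions are ordinary columns, so they can be placed on the x and y axes directly.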