library(zctaCrosswalk)
library(tidycensus)
library(dplyr)
zctaCrosswalk was designed to work well with the tidycensus package. tidycensus is currently the most popular way to access Census data in R. Here is an example of using it to get Median Household Income for all ZCTAs in the US:
zcta_income = get_acs(
  geography = "zcta",
  variables = "B19013_001",
  year      = 2021)
#> Getting data from the 2017-2021 5-year ACS
head(zcta_income)
#> # A tibble: 6 × 5
#> GEOID NAME variable estimate moe
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 00601 ZCTA5 00601 B19013_001 15292 1299
#> 2 00602 ZCTA5 00602 B19013_001 18716 1340
#> 3 00603 ZCTA5 00603 B19013_001 16789 966
#> 4 00606 ZCTA5 00606 B19013_001 18835 2837
#> 5 00610 ZCTA5 00610 B19013_001 21239 1919
#> 6 00611 ZCTA5 00611 B19013_001 17143 10456
Note that ?get_acs returns data for all ZCTAs in the US. It does not provide an option to get data on ZCTAs by state or county, and the data frame it returns does not contain enough metadata to let you do this subsetting yourself.
A primary motivation for creating the zctaCrosswalk package was to support this type of analysis. Note that ?get_acs returns the ZCTA in a column called GEOID. We can combine this fact with ?dplyr::filter, ?get_zctas_by_county and ?get_zctas_by_state to subset to any states or counties we choose.
Here we filter zcta_income to ZCTAs in San Francisco County, California:
nrow(zcta_income)
#> [1] 33774
sf_zcta_income = zcta_income |>
  dplyr::filter(GEOID %in% get_zctas_by_county("06075"))
#> Using column county_fips
nrow(sf_zcta_income)
#> [1] 30
head(sf_zcta_income)
#> # A tibble: 6 × 5
#> GEOID NAME variable estimate moe
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 94102 ZCTA5 94102 B19013_001 55888 8518
#> 2 94103 ZCTA5 94103 B19013_001 93143 19514
#> 3 94104 ZCTA5 94104 B19013_001 42591 34706
#> 4 94105 ZCTA5 94105 B19013_001 244662 44963
#> 5 94107 ZCTA5 94107 B19013_001 164289 16291
#> 6 94108 ZCTA5 94108 B19013_001 65392 9547
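The same pattern should carry over to whole states via ?get_zctas_by_state. The following is only a minimal sketch, not output from this vignette: toy_income is a hypothetical three-row stand-in for the ?get_acs result (so the example runs without a Census API key), and passing the state FIPS code "06" for California is an assumption made by analogy with the county FIPS code used above.

```r
library(zctaCrosswalk)
library(dplyr)

# Hypothetical stand-in for the get_acs() result; only GEOID matters for filtering.
# The rows reuse values shown earlier: one Puerto Rico ZCTA, two San Francisco ZCTAs.
toy_income = dplyr::tibble(
  GEOID    = c("00601", "94102", "94103"),
  estimate = c(15292, 55888, 93143))

# Assumption: a state FIPS code is accepted here, as county FIPS codes are above.
ca_zctas = get_zctas_by_state("06")

# Keep only the rows whose ZCTA belongs to California.
ca_income = toy_income |>
  dplyr::filter(GEOID %in% ca_zctas)

ca_income$GEOID
```

Under these assumptions, only the two San Francisco ZCTAs remain, since 00601 is in Puerto Rico. With real data you would replace toy_income with zcta_income.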
A primary motivation in creating this workflow (and indeed, this package) was to create demographic maps at the ZCTA level for selected states and counties. If this interests you as well, I encourage you to copy the code below into R and view the output yourself. (Unfortunately, R package vignettes do not seem to handle map output from the mapview package well.) This is a powerful and elegant pattern for visualizing ZCTA demographics in R:
library(zctaCrosswalk)
library(tidycensus)
library(dplyr)
library(mapview)
all_zctas = get_acs(
  geography = "zcta",
  variables = "B19013_001",
  year      = 2021,
  geometry  = TRUE)

filtered_zctas = filter(all_zctas, GEOID %in% get_zctas_by_county(6075))

mapview(filtered_zctas, zcol = "estimate")