Using R as your GIS

Applying R to community planning and working with spatial data

Eli Pousson

Baltimore City Department of Planning

August 2, 2023

1 Welcome 👋

1.1 What are we talking about?

  • A Short Introduction to R
  • Using R at the Department of Planning
  • Why use R as your GIS?

Find the presentation at elipousson.github.io/presentation-tugis2023/ or the code at github.com/elipousson/presentation-tugis2023

1.2 Where am I coming from?

A few things about me:

  • I work as a community planner at the Baltimore City Department of Planning
  • I started learning R as a student at the John Hopkins University School of Public Health in 2019
  • I started developing R packages while working at the Neighborhood Design Center in 2020
  • I teach using a course on spatial data and R for the UMBC GIS Certificate program

2 A Short Introduction to R

2.1 What is R?

R is best known for as a statistical programming language often used in data science and research.

Like Python or C++, R is an object-oriented, functional programming language where the base set of features can be extended through open-source packages or libraries.

R has been around a while 📜

R turns 30 years old this month 🎂

R is based an S—a language created at Bell Labs in 1976 to support exploratory data analysis.

Digital Equipment Corporation VAX 11/780 mainframe computer. Source: Boston Public Library Copyright Spencer Grant

R doesn’t use a graphical interface 🤖

R is growing in popularity 📈

R is growing in popularity 📈

The average number of monthly downloads for {sf} (the most popular package for working with spatial data in R) has grown from just 1,300 in June 2018 to over 58,000 in June 2023.

R is flexible 🛠️

There can’t be packages for everything—but sometimes it feels that way.

Packages let you work with everything from Microsoft Word documents to 3D renderings of digital elevation data to Google Drive to Google Earth Engine.

Animated loop showing a mountain with lighting from different directions.

Created with {rayshader}

2.2 How does R work with spatial data?

The {sf} package, first published in 2016, is the most popular R package for spatial data.

Extension packages include:

  • lwgeom for selected liblwgeom/PostGIS functions
  • stars for raster data, and raster or vector data cubes (spatial time series)
  • sfnetworks for geospatial network data

sf package logo

sfnetworks package logo

{sf} is built on open source libraries

Like QGIS, {sf} rests on a foundation of open source libraries:

  • SQLite (a C library that implements a SQL database engine),
  • GDAL (the Geospatial Data Abstraction Library),
  • PROJ (a coordinate transformation software library),
  • and GEOS (a C/C++ library for computational geometry).

GDAL logo

sf objects are data frames with “sticky” geometry

sf objects are an implementation of the simple feature standard.

Attribute data is attached to a “sticky” geometry column.

Illustration (c) 2018 by Allison Horst

2.3 Take a look at {sf} in action

Packages are simple to install:

install.packages(c("sf", "ggplot2", "dplyr"))

And to load:

library(sf)
library(ggplot2)
library(dplyr)

Using the {sf} package, you can use st_read() to read data into R from a local file, URL, or database:

md <- st_read("files/md_counties.gpkg")
Reading layer `md_counties' from data source 
  `/Users/elipousson/Desktop/presentation-tugis2023/files/md_counties.gpkg' 
  using driver `GPKG'
Simple feature collection with 24 features and 17 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8848525 ymin: 4563419 xmax: -8347435 ymax: 4825776
Projected CRS: WGS 84 / Pseudo-Mercator

You can take a look at the geometry column using st_geometry():

st_geometry(md)
Geometry set for 24 features 
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8848525 ymin: 4563419 xmax: -8347435 ymax: 4825776
Projected CRS: WGS 84 / Pseudo-Mercator
First 5 geometries:
MULTIPOLYGON (((-8380505 4582785, -8380646 4582...
MULTIPOLYGON (((-8772971 4809172, -8772981 4809...
MULTIPOLYGON (((-8539486 4774265, -8539486 4774...
MULTIPOLYGON (((-8434911 4785700, -8434909 4785...
MULTIPOLYGON (((-8537944 4825494, -8537927 4825...

Or, if you drop the geometry, a sf object is just like a spreadsheet:

st_drop_geometry(md)
   STATEFP COUNTYFP COUNTYNS GEOID            NAME               NAMELSAD LSAD
1       24      047 01668802 24047       Worcester       Worcester County   06
2       24      001 01713506 24001        Allegany        Allegany County   06
3       24      510 01702381 24510       Baltimore         Baltimore city   25
4       24      015 00596115 24015           Cecil           Cecil County   06
5       24      005 01695314 24005       Baltimore       Baltimore County   06
6       24      013 01696228 24013         Carroll         Carroll County   06
7       24      009 01676636 24009         Calvert         Calvert County   06
8       24      019 00596495 24019      Dorchester      Dorchester County   06
9       24      003 01710958 24003    Anne Arundel    Anne Arundel County   06
10      24      021 01711211 24021       Frederick       Frederick County   06
11      24      033 01714670 24033 Prince George's Prince George's County   06
12      24      027 01709077 24027          Howard          Howard County   06
13      24      023 01711058 24023         Garrett         Garrett County   06
14      24      031 01712500 24031      Montgomery      Montgomery County   06
15      24      035 00596089 24035    Queen Anne's    Queen Anne's County   06
16      24      041 00592947 24041          Talbot          Talbot County   06
17      24      045 01668606 24045        Wicomico        Wicomico County   06
18      24      039 00596907 24039        Somerset        Somerset County   06
19      24      025 01698178 24025         Harford         Harford County   06
20      24      037 01697853 24037      St. Mary's      St. Mary's County   06
21      24      029 00593907 24029            Kent            Kent County   06
22      24      017 01676992 24017         Charles         Charles County   06
23      24      043 01714220 24043      Washington      Washington County   06
24      24      011 00595737 24011        Caroline        Caroline County   06
   CLASSFP MTFCC CSAFP CBSAFP METDIVFP FUNCSTAT      ALAND     AWATER
1       H1 G4020   480  41540     <NA>        A 1213156434  586531107
2       H1 G4020  <NA>  19060     <NA>        A 1093489884   14710932
3       C7 G4020   548  12580     <NA>        F  209649327   28758743
4       H1 G4020   428  37980    48864        A  896912533  185281256
5       H1 G4020   548  12580     <NA>        A 1549740652  215957832
6       H1 G4020   548  12580     <NA>        A 1159355859   13112464
7       H1 G4020   548  47900    47894        A  552158542  341580668
8       H1 G4020   480  15700     <NA>        A 1400573746 1145353068
9       H1 G4020   548  12580     <NA>        A 1074353889  448032843
10      H1 G4020   548  47900    23224        A 1710922224   17674121
11      H1 G4020   548  47900    47894        A 1250057213   41922695
12      H1 G4020   548  12580     <NA>        A  649956423    6336170
13      H1 G4020  <NA>   <NA>     <NA>        A 1681102437   22498420
14      H1 G4020   548  47900    23224        A 1277193339   35686502
15      H1 G4020   548  12580     <NA>        A  962673214  360020725
16      H1 G4020   548  20660     <NA>        A  695561637  539363457
17      H1 G4020   480  41540     <NA>        A  969767208   66764613
18      H1 G4020   480  41540     <NA>        A  828145301  752652883
19      H1 G4020   548  12580     <NA>        A 1132152044  231885675
20      H1 G4020   548  15680     <NA>        A  928809940 1050592561
21      H1 G4020  <NA>   <NA>     <NA>        A  717497117  353321619
22      H1 G4020   548  47900    47894        A 1185757843  479438830
23      H1 G4020   548  25180     <NA>        A 1185655248   24820607
24      H1 G4020  <NA>   <NA>     <NA>        A  827350254   16777066
      INTPTLAT     INTPTLON
1  +38.2221332 -075.3099315
2  +39.6123134 -078.7031037
3  +39.3000324 -076.6104761
4  +39.5623537 -075.9415852
5  +39.4431666 -076.6165693
6  +39.5633280 -077.0153297
7  +38.5227191 -076.5297621
8  +38.4291957 -076.0474333
9  +38.9916174 -076.5608941
10 +39.4701773 -077.3976358
11 +38.8292778 -076.8481880
12 +39.2522639 -076.9244057
13 +39.5472985 -079.2746192
14 +39.1373815 -077.2030633
15 +39.0406929 -076.0824053
16 +38.7483486 -076.1784757
17 +38.3673699 -075.6320828
18 +38.0744501 -075.8533228
19 +39.5374292 -076.2997894
20 +38.2230765 -076.5344870
21 +39.2412793 -076.1259867
22 +38.4728532 -077.0154272
23 +39.6036207 -077.8146709
24 +38.8715306 -075.8316620

You can plot data with R’s built-in plot() function:

plot(md[, 2])

You can summarize data using the summary() function:

summary(md)
   STATEFP            COUNTYFP           COUNTYNS            GEOID          
 Length:24          Length:24          Length:24          Length:24         
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
     NAME             NAMELSAD             LSAD             CLASSFP         
 Length:24          Length:24          Length:24          Length:24         
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
    MTFCC              CSAFP              CBSAFP            METDIVFP        
 Length:24          Length:24          Length:24          Length:24         
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
   FUNCSTAT             ALAND               AWATER            INTPTLAT        
 Length:24          Min.   :2.096e+08   Min.   :6.336e+06   Length:24         
 Class :character   1st Qu.:8.279e+08   1st Qu.:2.424e+07   Class :character  
 Mode  :character   Median :1.084e+09   Median :2.006e+08   Mode  :character  
                    Mean   :1.048e+09   Mean   :2.908e+08                     
                    3rd Qu.:1.222e+09   3rd Qu.:4.559e+08                     
                    Max.   :1.711e+09   Max.   :1.145e+09                     
   INTPTLON                    geom   
 Length:24          MULTIPOLYGON :24  
 Class :character   epsg:3857    : 0  
 Mode  :character   +proj=merc...: 0  
                                      
                                      
                                      

Using {sf} with tidyverse packages

The tidyverse family of R packages developed by Posit (formerly known as RStudio) including:

  • ggplot2 for making graphics and data visualizations from bar charts to box plots to maps
  • dplyr for common data manipulation challenges, such as filtering, re-arranging, or summarizing data
  • readr for reading rectangular data in a fast and friendly way

Tidyverse packages

tidyverse packages work well with {sf}. For example, we can use geom_sf() from {ggplot2} to make a simple map:

md_map <- ggplot(data = md) +
  geom_sf() +
  theme_minimal()

md_map

Or take a peek at the values of the data with the glimpse() function from {dplyr}:

glimpse(md)
Rows: 24
Columns: 18
$ STATEFP  <chr> "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "…
$ COUNTYFP <chr> "047", "001", "510", "015", "005", "013", "009", "019", "003"…
$ COUNTYNS <chr> "01668802", "01713506", "01702381", "00596115", "01695314", "…
$ GEOID    <chr> "24047", "24001", "24510", "24015", "24005", "24013", "24009"…
$ NAME     <chr> "Worcester", "Allegany", "Baltimore", "Cecil", "Baltimore", "…
$ NAMELSAD <chr> "Worcester County", "Allegany County", "Baltimore city", "Cec…
$ LSAD     <chr> "06", "06", "25", "06", "06", "06", "06", "06", "06", "06", "…
$ CLASSFP  <chr> "H1", "H1", "C7", "H1", "H1", "H1", "H1", "H1", "H1", "H1", "…
$ MTFCC    <chr> "G4020", "G4020", "G4020", "G4020", "G4020", "G4020", "G4020"…
$ CSAFP    <chr> "480", NA, "548", "428", "548", "548", "548", "480", "548", "…
$ CBSAFP   <chr> "41540", "19060", "12580", "37980", "12580", "12580", "47900"…
$ METDIVFP <chr> NA, NA, NA, "48864", NA, NA, "47894", NA, NA, "23224", "47894…
$ FUNCSTAT <chr> "A", "A", "F", "A", "A", "A", "A", "A", "A", "A", "A", "A", "…
$ ALAND    <dbl> 1213156434, 1093489884, 209649327, 896912533, 1549740652, 115…
$ AWATER   <dbl> 586531107, 14710932, 28758743, 185281256, 215957832, 13112464…
$ INTPTLAT <chr> "+38.2221332", "+39.6123134", "+39.3000324", "+39.5623537", "…
$ INTPTLON <chr> "-075.3099315", "-078.7031037", "-076.6104761", "-075.9415852…
$ geom     <MULTIPOLYGON [m]> MULTIPOLYGON (((-8380505 45..., MULTIPOLYGON (((…

Using {sf} for spatial transformations

{sf} includes a range of spatial transformation functions with names matching the spatial functions of PostGIS.

To show how this works, we can get the center of Baltimore City, buffer by 25 miles, and filter intersecting counties.

ST_intersects relationships with different geometry types

Using {sf} for spatial transformations

balt_center <- st_centroid(md[3, ])

balt_center
Simple feature collection with 1 feature and 17 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -8528259 ymin: 4764882 xmax: -8528259 ymax: 4764882
Projected CRS: WGS 84 / Pseudo-Mercator
  STATEFP COUNTYFP COUNTYNS GEOID      NAME       NAMELSAD LSAD CLASSFP MTFCC
3      24      510 01702381 24510 Baltimore Baltimore city   25      C7 G4020
  CSAFP CBSAFP METDIVFP FUNCSTAT     ALAND   AWATER    INTPTLAT     INTPTLON
3   548  12580     <NA>        F 209649327 28758743 +39.3000324 -076.6104761
                      geom
3 POINT (-8528259 4764882)

Using {sf} for spatial transformations

balt_center <- st_centroid(md[3, ])

balt_area <- st_buffer(balt_center, dist = as_units(25, "mi"))

balt_area
Simple feature collection with 1 feature and 17 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -8568493 ymin: 4724649 xmax: -8488026 ymax: 4805116
Projected CRS: WGS 84 / Pseudo-Mercator
  STATEFP COUNTYFP COUNTYNS GEOID      NAME       NAMELSAD LSAD CLASSFP MTFCC
3      24      510 01702381 24510 Baltimore Baltimore city   25      C7 G4020
  CSAFP CBSAFP METDIVFP FUNCSTAT     ALAND   AWATER    INTPTLAT     INTPTLON
3   548  12580     <NA>        F 209649327 28758743 +39.3000324 -076.6104761
                            geom
3 POLYGON ((-8488026 4764882,...

Using {sf} for spatial transformations

balt_center <- st_centroid(md[3, ])

balt_area <- st_buffer(balt_center, dist = as_units(25, "mi"))

balt_area_counties <- st_filter(md, balt_area, .predicate = st_intersects)

balt_area_counties
Simple feature collection with 9 features and 17 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8630342 ymin: 4655334 xmax: -8433120 ymax: 4825554
Projected CRS: WGS 84 / Pseudo-Mercator
  STATEFP COUNTYFP COUNTYNS GEOID            NAME               NAMELSAD LSAD
1      24      510 01702381 24510       Baltimore         Baltimore city   25
2      24      005 01695314 24005       Baltimore       Baltimore County   06
3      24      013 01696228 24013         Carroll         Carroll County   06
4      24      003 01710958 24003    Anne Arundel    Anne Arundel County   06
5      24      033 01714670 24033 Prince George's Prince George's County   06
6      24      027 01709077 24027          Howard          Howard County   06
7      24      031 01712500 24031      Montgomery      Montgomery County   06
8      24      025 01698178 24025         Harford         Harford County   06
9      24      029 00593907 24029            Kent            Kent County   06
  CLASSFP MTFCC CSAFP CBSAFP METDIVFP FUNCSTAT      ALAND    AWATER    INTPTLAT
1      C7 G4020   548  12580     <NA>        F  209649327  28758743 +39.3000324
2      H1 G4020   548  12580     <NA>        A 1549740652 215957832 +39.4431666
3      H1 G4020   548  12580     <NA>        A 1159355859  13112464 +39.5633280
4      H1 G4020   548  12580     <NA>        A 1074353889 448032843 +38.9916174
5      H1 G4020   548  47900    47894        A 1250057213  41922695 +38.8292778
6      H1 G4020   548  12580     <NA>        A  649956423   6336170 +39.2522639
7      H1 G4020   548  47900    23224        A 1277193339  35686502 +39.1373815
8      H1 G4020   548  12580     <NA>        A 1132152044 231885675 +39.5374292
9      H1 G4020  <NA>   <NA>     <NA>        A  717497117 353321619 +39.2412793
      INTPTLON                           geom
1 -076.6104761 MULTIPOLYGON (((-8539486 47...
2 -076.6165693 MULTIPOLYGON (((-8537944 48...
3 -077.0153297 MULTIPOLYGON (((-8566905 48...
4 -076.5608941 MULTIPOLYGON (((-8524879 47...
5 -076.8481880 MULTIPOLYGON (((-8564237 47...
6 -076.9244057 MULTIPOLYGON (((-8545393 47...
7 -077.2030633 MULTIPOLYGON (((-8577825 47...
8 -076.2997894 MULTIPOLYGON (((-8483226 47...
9 -076.1259867 MULTIPOLYGON (((-8479204 47...

Using {sf} for spatial transformations

ggplot(data = md) +
  geom_sf() +
  geom_sf(data = balt_area_counties, fill = "darkorange", alpha = 0.4) +
  geom_sf(data = balt_area, color = "orange", fill = NA, linewidth = 1) +
  theme_void()

Using {sf} for spatial transformations

Using {dplyr} for attribute transformations

Changes to attribute data with functions like summarise() also change geometry:

md_AWATER_summary <- md |> 
  mutate(
    perc_AWATER = round(AWATER / (ALAND + AWATER), digits = 2),
    category = case_when(
      perc_AWATER > 0.4 ~ "A lot of water",
      perc_AWATER > 0.1 ~ "Some water",
      .default = "Not much water"
    ),
    category = factor(category, c("A lot of water", "Some water", "Not much water"))
  ) |> 
  group_by(category) |> 
  summarise(
    mean_perc_AWATER = mean(perc_AWATER)
  ) |> 
  arrange(category)

glimpse(md_AWATER_summary)
Rows: 3
Columns: 3
$ category         <fct> A lot of water, Some water, Not much water
$ mean_perc_AWATER <dbl> 0.475, 0.247, 0.021
$ geom             <GEOMETRY [m]> POLYGON ((-8438155 4567026,..., MULTIPOLYGON (((-8384…

Additional packages for spatial data

{sf} is not the only package for working with spatial data! Developers are both extending {sf} and using alternate approaches to support:

  • Mapping and visualization
  • Downloading data from different sources
  • Integration with other applications
  • Changing the geometry or attributes of data

Interactive mapping with mapview()

You can view data interactively with {mapview} and {leaflet} (built on top of the Leaflet Javascript library):

library(mapview)

mapview(md_AWATER_summary, zcol = "category")

Downloading data from OpenStreetMap

You can download data from OpenStreetMap by using the {osmdata} package to access the Overpass API:

library(osmdata)

balt_area_schools_query <- balt_area |>
  st_transform(4326) |>
  st_bbox() |>
  opq() |>
  add_osm_features(
    features = c(
      "amenity" = "university",
      "amenity" = "college"
    )
  )

balt_area_schools <- osmdata_sf(balt_area_schools_query)

The {osmdata} package returns a list with the original query and the different geometry types extracted from OpenStreetMap:

str(balt_area_schools)
List of 8
 $ bbox             : chr "39.0207808733236,-76.9720808621772,39.5801417319894,-76.2492317058742"
 $ overpass_call    : chr "[out:xml][timeout:25];\n(\n  node [\"amenity\"=\"university\"] (39.0207808733236,-76.9720808621772,39.580141731"| __truncated__
 $ meta             :List of 3
  ..$ timestamp       : chr "[ Tue 3 Aug 2023 18:38:15 ]"
  ..$ OSM_version     : chr "0.6"
  ..$ overpass_version: chr "Overpass API 0.7.61.4 df4c946a"
 $ osm_points       :Classes 'sf' and 'data.frame': 2793 obs. of  28 variables:
  ..$ osm_id               : chr [1:2793] "37522081" "49391514" "50379933" "50621967" ...
  ..$ name                 : chr [1:2793] NA NA NA NA ...
  ..$ addr:city            : chr [1:2793] NA NA NA NA ...
  ..$ addr:housenumber     : chr [1:2793] NA NA NA NA ...
  ..$ addr:postcode        : chr [1:2793] NA NA NA NA ...
  ..$ addr:state           : chr [1:2793] NA NA NA NA ...
  ..$ addr:street          : chr [1:2793] NA NA NA NA ...
  ..$ amenity              : chr [1:2793] NA NA NA NA ...
  ..$ barrier              : chr [1:2793] NA NA NA NA ...
  ..$ button_operated      : chr [1:2793] NA NA NA NA ...
  ..$ check_date           : chr [1:2793] NA NA NA NA ...
  ..$ crossing             : chr [1:2793] NA NA NA NA ...
  ..$ ele                  : chr [1:2793] NA NA NA NA ...
  ..$ entrance             : chr [1:2793] NA NA NA NA ...
  ..$ gnis:county_id       : chr [1:2793] NA NA NA NA ...
  ..$ gnis:created         : chr [1:2793] NA NA NA NA ...
  ..$ gnis:edited          : chr [1:2793] NA NA NA NA ...
  ..$ gnis:feature_id      : chr [1:2793] NA NA NA NA ...
  ..$ gnis:state_id        : chr [1:2793] NA NA NA NA ...
  ..$ highway              : chr [1:2793] NA NA NA NA ...
  ..$ internet_access      : chr [1:2793] NA NA NA NA ...
  ..$ kerb                 : chr [1:2793] NA NA NA NA ...
  ..$ motorcar             : chr [1:2793] NA NA NA NA ...
  ..$ motorcycle           : chr [1:2793] NA NA NA NA ...
  ..$ operator             : chr [1:2793] NA NA NA NA ...
  ..$ short_name           : chr [1:2793] NA NA NA NA ...
  ..$ traffic_signals:sound: chr [1:2793] NA NA NA NA ...
  ..$ geometry             :sfc_POINT of length 2793; first list element:  'XY' num [1:2] -76.5 39.3
  ..- attr(*, "sf_column")= chr "geometry"
  ..- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  .. ..- attr(*, "names")= chr [1:27] "osm_id" "name" "addr:city" "addr:housenumber" ...
 $ osm_lines        :Classes 'sf' and 'data.frame': 34 obs. of  40 variables:
  ..$ osm_id             : chr [1:34] "31030343" "31274489" "114709354" "235563773" ...
  ..$ name               : chr [1:34] NA NA NA NA ...
  ..$ access             : chr [1:34] NA NA NA NA ...
  ..$ addr:city          : chr [1:34] NA NA NA NA ...
  ..$ addr:country       : chr [1:34] NA NA NA NA ...
  ..$ addr:housenumber   : chr [1:34] NA NA NA NA ...
  ..$ addr:postcode      : chr [1:34] NA NA NA NA ...
  ..$ addr:state         : chr [1:34] NA NA NA NA ...
  ..$ addr:street        : chr [1:34] NA NA NA NA ...
  ..$ amenity            : chr [1:34] NA NA NA NA ...
  ..$ barrier            : chr [1:34] NA NA NA NA ...
  ..$ building           : chr [1:34] NA NA NA NA ...
  ..$ building:levels    : chr [1:34] NA NA NA NA ...
  ..$ check_date         : chr [1:34] NA NA NA NA ...
  ..$ denomination       : chr [1:34] NA NA NA NA ...
  ..$ ele                : chr [1:34] NA NA NA NA ...
  ..$ fence_type         : chr [1:34] NA NA NA NA ...
  ..$ gnis:county_id     : chr [1:34] NA NA NA NA ...
  ..$ gnis:county_name   : chr [1:34] NA NA NA NA ...
  ..$ gnis:created       : chr [1:34] NA NA NA NA ...
  ..$ gnis:edited        : chr [1:34] NA NA NA NA ...
  ..$ gnis:feature_id    : chr [1:34] NA NA NA NA ...
  ..$ gnis:import_uuid   : chr [1:34] NA NA NA NA ...
  ..$ gnis:reviewed      : chr [1:34] NA NA NA NA ...
  ..$ gnis:state_id      : chr [1:34] NA NA NA NA ...
  ..$ internet_access    : chr [1:34] NA NA NA NA ...
  ..$ internet_access:fee: chr [1:34] NA NA NA NA ...
  ..$ name_1             : chr [1:34] NA NA NA NA ...
  ..$ old_name           : chr [1:34] NA NA NA NA ...
  ..$ operator           : chr [1:34] NA NA NA NA ...
  ..$ operator:wikidata  : chr [1:34] NA NA NA NA ...
  ..$ phone              : chr [1:34] NA NA NA NA ...
  ..$ religion           : chr [1:34] NA NA NA NA ...
  ..$ short_name         : chr [1:34] NA NA NA NA ...
  ..$ source             : chr [1:34] NA NA NA NA ...
  ..$ surface            : chr [1:34] NA NA NA NA ...
  ..$ website            : chr [1:34] NA NA NA NA ...
  ..$ wikidata           : chr [1:34] NA NA NA NA ...
  ..$ wikipedia          : chr [1:34] NA NA NA NA ...
  ..$ geometry           :sfc_LINESTRING of length 34; first list element:  'XY' num [1:84, 1:2] -76.7 -76.7 -76.7 -76.7 -76.7 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:84] "364705495" "2821804114" "2821804115" "2821804116" ...
  .. .. ..$ : chr [1:2] "lon" "lat"
  ..- attr(*, "sf_column")= chr "geometry"
  ..- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  .. ..- attr(*, "names")= chr [1:39] "osm_id" "name" "access" "addr:city" ...
 $ osm_polygons     :Classes 'sf' and 'data.frame': 85 obs. of  40 variables:
  ..$ osm_id             : chr [1:85] "31119729" "31119744" "82222563" "86580064" ...
  ..$ name               : chr [1:85] "Community College of Baltimore County: Catonsville" "Community College of Baltimore County: Essex" "Community College Of Baltimore County Hunt Valley Extension" "Anne Arundel Community College - Arundel Mills" ...
  ..$ access             : chr [1:85] NA NA NA NA ...
  ..$ addr:city          : chr [1:85] "Baltimore" "Baltimore" "Hunt Valley" NA ...
  ..$ addr:country       : chr [1:85] NA NA NA NA ...
  ..$ addr:housenumber   : chr [1:85] "800" "7201" "11101" NA ...
  ..$ addr:postcode      : chr [1:85] "21228" "21237" "21031" NA ...
  ..$ addr:state         : chr [1:85] "MD" NA NA NA ...
  ..$ addr:street        : chr [1:85] "South Rolling Road" "Rossville Boulevard" "McCormick Road" NA ...
  ..$ amenity            : chr [1:85] "college" "college" "college" "college" ...
  ..$ barrier            : chr [1:85] NA NA NA NA ...
  ..$ building           : chr [1:85] NA NA "yes" NA ...
  ..$ building:levels    : chr [1:85] NA NA NA NA ...
  ..$ check_date         : chr [1:85] NA NA NA NA ...
  ..$ denomination       : chr [1:85] NA NA NA NA ...
  ..$ ele                : chr [1:85] NA NA NA "50" ...
  ..$ fence_type         : chr [1:85] NA NA NA NA ...
  ..$ gnis:county_id     : chr [1:85] NA NA NA "003" ...
  ..$ gnis:county_name   : chr [1:85] NA NA NA NA ...
  ..$ gnis:created       : chr [1:85] NA NA NA "08/24/2007" ...
  ..$ gnis:edited        : chr [1:85] NA NA NA NA ...
  ..$ gnis:feature_id    : chr [1:85] "592063" "592133" NA "2132656" ...
  ..$ gnis:import_uuid   : chr [1:85] NA NA NA NA ...
  ..$ gnis:reviewed      : chr [1:85] NA NA NA NA ...
  ..$ gnis:state_id      : chr [1:85] NA NA NA "24" ...
  ..$ internet_access    : chr [1:85] NA NA NA NA ...
  ..$ internet_access:fee: chr [1:85] NA NA NA NA ...
  ..$ name_1             : chr [1:85] NA NA NA NA ...
  ..$ old_name           : chr [1:85] NA NA NA NA ...
  ..$ operator           : chr [1:85] NA NA NA NA ...
  ..$ operator:wikidata  : chr [1:85] NA NA NA NA ...
  ..$ phone              : chr [1:85] NA "+1 443 840 2222" "+1-443-840-5830" NA ...
  ..$ religion           : chr [1:85] NA NA NA NA ...
  ..$ short_name         : chr [1:85] NA NA "CCBC - Hunt Valley Ext" NA ...
  ..$ source             : chr [1:85] NA NA NA NA ...
  ..$ surface            : chr [1:85] NA NA NA NA ...
  ..$ website            : chr [1:85] "www.ccbcmd.edu" "http://www.ccbcmd.edu/About-CCBC/Locations/CCBC-Essex.aspx" "http://www.ccbcmd.edu/" NA ...
  ..$ wikidata           : chr [1:85] NA NA NA NA ...
  ..$ wikipedia          : chr [1:85] NA NA NA NA ...
  ..$ geometry           :sfc_POLYGON of length 85; first list element: List of 1
  .. ..$ : num [1:68, 1:2] -76.7 -76.7 -76.7 -76.7 -76.7 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:68] "345874523" "6390816250" "6390816249" "345874525" ...
  .. .. .. ..$ : chr [1:2] "lon" "lat"
  .. ..- attr(*, "class")= chr [1:3] "XY" "POLYGON" "sfg"
  ..- attr(*, "sf_column")= chr "geometry"
  ..- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  .. ..- attr(*, "names")= chr [1:39] "osm_id" "name" "access" "addr:city" ...
 $ osm_multilines   : NULL
 $ osm_multipolygons:Classes 'sf' and 'data.frame': 11 obs. of  30 variables:
  ..$ osm_id              : chr [1:11] "1599827" "3170935" "6699380" "6849272" ...
  ..$ name                : chr [1:11] "Loyola University Maryland" "Towson University" "Stevenson University Owings Mills Campus" "The Peabody Institute of Johns Hopkins" ...
  ..$ addr:city           : chr [1:11] "Baltimore" "Towson" "" "" ...
  ..$ addr:housenumber    : chr [1:11] "4501" "8000" "" "" ...
  ..$ addr:postcode       : chr [1:11] "21210" "21204" "" "" ...
  ..$ addr:state          : chr [1:11] "" "MD" "" "" ...
  ..$ addr:street         : chr [1:11] "North Charles Street" "York Road" "" "" ...
  ..$ alt_name            : chr [1:11] "" "" "" "" ...
  ..$ amenity             : chr [1:11] "university" "university" "university" "university" ...
  ..$ check_date          : chr [1:11] "" "" "" "2022-05-13" ...
  ..$ ele                 : chr [1:11] "99" "" "" "24" ...
  ..$ gnis:county_id      : chr [1:11] "" "" "" "" ...
  ..$ gnis:created        : chr [1:11] "" "" "" "" ...
  ..$ gnis:feature_id     : chr [1:11] "" "" "" "" ...
  ..$ gnis:state_id       : chr [1:11] "" "" "" "" ...
  ..$ internet_access     : chr [1:11] "" "" "" "" ...
  ..$ internet_access:fee : chr [1:11] "" "" "" "" ...
  ..$ internet_access:ssid: chr [1:11] "" "" "" "" ...
  ..$ isced:level:2011    : chr [1:11] "" "" "" "" ...
  ..$ name:en             : chr [1:11] "" "" "" "" ...
  ..$ operator            : chr [1:11] "" "" "Stevenson University" "Johns Hopkins University" ...
  ..$ operator:short      : chr [1:11] "" "" "" "JHU" ...
  ..$ operator:wikidata   : chr [1:11] "" "" "" "" ...
  ..$ phone               : chr [1:11] "" "" "" "" ...
  ..$ short_name          : chr [1:11] "" "" "" "" ...
  ..$ type                : chr [1:11] "multipolygon" "multipolygon" "multipolygon" "multipolygon" ...
  ..$ website             : chr [1:11] "https://www.loyola.edu/" "" "" "" ...
  ..$ wikidata            : chr [1:11] "Q5399470" "Q1474129" "" "Q1156934" ...
  ..$ wikipedia           : chr [1:11] "en:Loyola University Maryland" "" "" "" ...
  ..$ geometry            :sfc_MULTIPOLYGON of length 11; first list element: List of 1
  .. ..$ :List of 6
  .. .. ..$ 114709358: num [1:65, 1:2] -76.6 -76.6 -76.6 -76.6 -76.6 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:65] "9823629897" "9823720659" "6303560379" "6303560374" ...
  .. .. .. .. ..$ : chr [1:2] "lat" "lon"
  .. .. ..$ 114709360: num [1:17, 1:2] -76.6 -76.6 -76.6 -76.6 -76.6 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:17] "1298673031" "1298673121" "9823712899" "9823712900" ...
  .. .. .. .. ..$ : chr [1:2] "lat" "lon"
  .. .. ..$ 673113087: num [1:8, 1:2] -76.6 -76.6 -76.6 -76.6 -76.6 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:8] "6303560373" "6303560372" "6303560371" "6303560370" ...
  .. .. .. .. ..$ : chr [1:2] "lat" "lon"
  .. .. ..$ 673113079: num [1:18, 1:2] -76.6 -76.6 -76.6 -76.6 -76.6 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:18] "6303560352" "6303560364" "6303560362" "6303560361" ...
  .. .. .. .. ..$ : chr [1:2] "lat" "lon"
  .. .. ..$ 673113549: num [1:6, 1:2] -76.6 -76.6 -76.6 -76.6 -76.6 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:6] "6303565073" "9823720645" "6303565076" "6303565075" ...
  .. .. .. .. ..$ : chr [1:2] "lat" "lon"
  .. .. ..$ 673113545: num [1:6, 1:2] -76.6 -76.6 -76.6 -76.6 -76.6 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:6] "6303565069" "6303565072" "9823720636" "6303565071" ...
  .. .. .. .. ..$ : chr [1:2] "lat" "lon"
  .. ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
  ..- attr(*, "sf_column")= chr "geometry"
  ..- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  .. ..- attr(*, "names")= chr [1:29] "osm_id" "name" "addr:city" "addr:housenumber" ...
 - attr(*, "class")= chr [1:3] "list" "osmdata" "osmdata_sf"

Integration with ArcGIS Pro or QGIS

If your desktop GIS does something that you don’t know how to do in R, you can even interface directly using:

You can also use the {esri2sf} package to download FeatureLayer data and metadata.

3 Using R at the Department of Planning

3.1 About the INSPIRE Program

INSPIRE stands for Investing in Neighborhoods and Schools to Promote Improvement, Revitalization, and Excellence.

The Department of Planning is working with communities in a quarter-mile area (about a five minute walk) around each 21st Century School to create a plan that can guide new private and public investments in each area.

3.2 Using R for the INSPIRE Program

3.3 What do I do with R?

I use R for a little bit of everything:

  • Process survey or observation data collected in the “field”
  • Summarize and combine data from spatial and non-spatial data sources
  • Create maps and tables for presentations and hand-outs (often in combination with Adobe Illustrator or Adobe InDesign)
  • Extract data from or add content to documents (see the {officer} and {officerExtras} package)

3.4 What do I do with R?

This work relies on a set of custom packages {mapbaltimore}, {bcpss}, and {mapmaryland} with city and state-specific data and data access functions.

Using the {mapbaltimore} package

Loading the {mapbaltimore} package gives you access to:

  • Neighborhoods
  • Administrative and political boundaries
  • INSPIRE plan area boundaries and much more

Using the {mapbaltimore} package

inspire_plans is one of dozens of datasets that are available after the package is loaded:

# pak::pkg_install("elipousson/mapbaltimore")
library(mapbaltimore)

inspire_plans <- st_transform(inspire_plans, crs = 3857)

area <- inspire_plans[5, ]

Using the {mapbaltimore} package

The package includes “helper” functions like get_area_streets() to filter data by name or location:

area_streets <- get_area_streets(area)

area_streets <- st_transform(area_streets, crs = 3857)

area_streets
Simple feature collection with 40 features and 1 field
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: -8526277 ymin: 4763286 xmax: -8525230 ymax: 4764336
Projected CRS: WGS 84 / Pseudo-Mercator
# A tibble: 40 × 2
   fullname                                                             geometry
 * <chr>                                                          <GEOMETRY [m]>
 1 BOYER ST        MULTILINESTRING ((-8525452 4763491, -8525403 4763493, -85253…
 2 CONGRESS CT                   LINESTRING (-8526276 4763464, -8526264 4763464)
 3 E BALTIMORE ST  MULTILINESTRING ((-8525609 4763559, -8525571 4763561), (-852…
 4 E FAIRMOUNT AVE MULTILINESTRING ((-8526123 4763690, -8526068 4763693), (-852…
 5 E FAYETTE ST    MULTILINESTRING ((-8526029 4763885, -8525948 4763889, -85258…
 6 E LOMBARD ST    MULTILINESTRING ((-8526075 4763368, -8526049 4763369, -85260…
 7 IRVINE PL       LINESTRING (-8526266 4763621, -8526267 4763644, -8526269 476…
 8 JEFFERSON ST    MULTILINESTRING ((-8525488 4764246, -8525457 4764247), (-852…
 9 LAMLEY ST       MULTILINESTRING ((-8525552 4763798, -8525528 4763800, -85254…
10 MOYER ST        MULTILINESTRING ((-8525639 4763641, -8525633 4763642), (-852…
# ℹ 30 more rows

Using the {mapbaltimore} package

Check out my GitHub profile for information on these packages.

3.5 Using R for the Commodore John Rodgers EMS INSPIRE Plan

INSPIRE supports improvements along school walking routes but we first need to figure out:

  • Where are safety improvements needed?
  • What routes do students take to school?

Where are safety improvements needed?

The {mapmaryland} package includes a function to download and format data from the state open data portal for vehicle crashes:

library(mapmaryland)

area_crashes <- get_md_crash_data(area)

area_crashes <- format_md_crash_data(area_crashes)

area_crashes

Where are safety improvements needed?

The dplyr::filter() function limits data to injury crashes:

area_crashes <- st_as_sf(area_crashes, coords = c("longitude", "latitude"))
st_crs(area_crashes) <- 4326

injury_crashes <- filter(area_crashes, report_type == "Injury Crash")

injury_crashes
Simple feature collection with 405 features and 43 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -76.59249 ymin: 39.28995 xmax: -76.58349 ymax: 39.29714
Geodetic CRS:  WGS 84
# A tibble: 405 × 44
   crash_date dotw  weekend  year quarter light_desc     county_desc   county_no
 * <date>     <chr> <chr>   <dbl> <chr>   <chr>          <chr>             <dbl>
 1 2018-05-25 Fri   N        2018 Q2      Daylight       Baltimore Ci…        24
 2 2016-08-15 Mon   N        2016 Q3      Daylight       Baltimore Ci…        24
 3 2019-01-07 Mon   N        2019 Q1      Daylight       Baltimore Ci…        24
 4 2016-07-24 Sun   Y        2016 Q3      Dark Lights On Baltimore Ci…        24
 5 2018-12-29 Sat   Y        2018 Q4      Daylight       Baltimore Ci…        24
 6 2016-07-07 Thu   N        2016 Q3      Daylight       Baltimore Ci…        24
 7 2017-11-28 Tue   N        2017 Q4      Dark Lights On Baltimore Ci…        24
 8 2017-10-30 Mon   N        2017 Q4      Daylight       Baltimore Ci…        24
 9 2016-10-15 Sat   Y        2016 Q4      Dark Lights On Baltimore Ci…        24
10 2016-02-12 Fri   N        2016 Q1      <NA>           Baltimore Ci…        24
# ℹ 395 more rows
# ℹ 36 more variables: junction_desc <chr>, collision_type_desc <chr>,
#   surf_cond_desc <chr>, rd_cond_desc <chr>, rd_div_desc <chr>,
#   fix_obj_desc <chr>, report_no <chr>, acc_date <dbl>, acc_time <time>,
#   signal_flag_desc <chr>, signal_flag <chr>, c_m_zone_flag <chr>,
#   harm_event_code1 <dbl>, harm_event_desc2 <chr>, harm_event_code2 <dbl>,
#   rte_no <dbl>, log_mile <dbl>, logmile_dir_flag_desc <chr>, …

Where are safety improvements needed?

Mapping pedestrian injury crashes help identify priority intersections:

ggplot(data = injury_crashes) +
  geom_sf(data = area, fill = NA) +
  geom_sf(data = area_streets) +
  geom_sf(aes(color = ped_injury), alpha = 0.75, size = 2) +
  scale_color_viridis_d() +
  theme_void()

Where are safety improvements needed?

Where are safety improvements needed?

We included crash data on a packet of printed maps helped we used during a walk around the neighborhood to identify priority locations.

Eli Pousson and Imani Jasper (Eastern District Planner)

What routes do students take to school?

Data from BCPSS on student addresses supported the creation of maps showing the distribtion of student addresses.

What are the routes that students take to school?

Poster sized map showing 10 minute walking distance to collect feedback from school community members at Commodore John Rodgers Elementary/Middle School

Using R for the John Ruhrah EMS INSPIRE Plan

John Ruhrah Elementary/Middle School INSPIRE Plan

The INSPIRE program is creating a community plan for each INSPIRE area.

For each plan, we need to know: what are the area demographics and housing conditions?

What are the area demographics and housing conditions?

How do you know who lives in the planning area?

Here are the key steps:

  • Download American Community Survey (ACS) data from the U.S. Census Bureau
  • Interpolate tract-level geometry for a planning area
  • Combine planning area ACS data with county- and metro area-level data
  • Create a formatted table of ACS data

To start, we can get the area from the inspire_plans data (and use mutate() add a new column we’ll need shortly):

area <- inspire_plans[13, ]
area <- mutate(area, NAME = plan_name_short)

area
Simple feature collection with 1 feature and 22 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8522284 ymin: 4761947 xmax: -8521246 ymax: 4762989
Projected CRS: WGS 84 / Pseudo-Mercator
# A tibble: 1 × 23
  plan_name         plan_name_short overall_status inspire_lead_planner plan_url
* <chr>             <chr>           <chr>          <chr>                <chr>   
1 John Ruhrah Elem… John Ruhrah EMS Review in pro… Eli Pousson          https:/…
# ℹ 18 more variables: year_launched <chr>, year_adopted <chr>,
#   adoption_status <chr>, adoption_date <chr>, document_url <chr>,
#   recommendation_report_status <chr>, recommendation_report_url <chr>,
#   kick_off_presentation_date <chr>, launch_date_target <chr>,
#   walking_route_id_target_date <chr>, recommendations_date_target <chr>,
#   commission_review_date_target <chr>, implementation_status <chr>,
#   planning_districts <chr>, neighborhoods <chr>, council_districts <chr>, …

We can use {tidycensus} package download American Community Survey data directly from the Census API:

acs_data <- get_acs(
  geography = "tract",
  table = "B25105",
  county = "Baltimore city",
  state = "MD",
  year = 2021,
  survey = "acs5",
  geometry = TRUE
)

acs_data <- st_transform(acs_data, 3857)

acs_data
Simple feature collection with 199 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8539487 ymin: 4749959 xmax: -8519220 ymax: 4775128
Projected CRS: WGS 84 / Pseudo-Mercator
First 10 features:
         GEOID                                           NAME   variable
1  24510160300    Census Tract 1603, Baltimore city, Maryland B25105_001
2  24510020100     Census Tract 201, Baltimore city, Maryland B25105_001
3  24510190300    Census Tract 1903, Baltimore city, Maryland B25105_001
4  24510271801 Census Tract 2718.01, Baltimore city, Maryland B25105_001
5  24510272007 Census Tract 2720.07, Baltimore city, Maryland B25105_001
6  24510240200    Census Tract 2402, Baltimore city, Maryland B25105_001
7  24510270903 Census Tract 2709.03, Baltimore city, Maryland B25105_001
8  24510151100    Census Tract 1511, Baltimore city, Maryland B25105_001
9  24510160600    Census Tract 1606, Baltimore city, Maryland B25105_001
10 24510090400     Census Tract 904, Baltimore city, Maryland B25105_001
   estimate moe                       geometry
1       788 223 MULTIPOLYGON (((-8532207 47...
2      1780 231 MULTIPOLYGON (((-8526012 47...
3       589 342 MULTIPOLYGON (((-8532300 47...
4       531 107 MULTIPOLYGON (((-8536825 47...
5      1291  58 MULTIPOLYGON (((-8539487 47...
6      2198 151 MULTIPOLYGON (((-8528101 47...
7      1156 129 MULTIPOLYGON (((-8527043 47...
8      1221 151 MULTIPOLYGON (((-8535831 47...
9       870 187 MULTIPOLYGON (((-8534762 47...
10      959 110 MULTIPOLYGON (((-8528134 47...

We can use st_filter() to filter the data to our planning area:

mapview(list(area, st_filter(acs_data, area)))

The {getACS} package provides functions to weight tract-level estimates based on share of the block-level population or households that live.

Start by making a crosswalk between block and tract data:

block_xwalk <- make_block_xwalk("MD", "Baltimore city", crs = 3857)

block_xwalk
Simple feature collection with 10025 features and 28 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -8539487 ymin: 4749959 xmax: -8519220 ymax: 4775128
Projected CRS: WGS 84 / Pseudo-Mercator
First 10 features:
   STATEFP20 COUNTYFP20 TRACTCE20 BLOCKCE20         GEOID20     NAME20 MTFCC20
1         24        510    270401      4006 245102704014006 Block 4006   G5040
2         24        510    272007      1010 245102720071010 Block 1010   G5040
3         24        510    130200      1007 245101302001007 Block 1007   G5040
4         24        510    250500      3020 245102505003020 Block 3020   G5040
5         24        510    200100      1009 245102001001009 Block 1009   G5040
6         24        510    272003      1002 245102720031002 Block 1002   G5040
7         24        510    030200      1020 245100302001020 Block 1020   G5040
8         24        510    271300      3023 245102713003023 Block 3023   G5040
9         24        510    271501      2014 245102715012014 Block 2014   G5040
10        24        510    250500      1015 245102505001015 Block 1015   G5040
   UR20 UACE20 UATYPE20 FUNCSTAT20 ALAND20 AWATER20  INTPTLAT20   INTPTLON20
1     U  04843        U          S   19165        0 +39.3455590 -076.5541477
2     U  04843        U          S   29938        0 +39.3637168 -076.7028884
3     U  04843        U          S    1630        0 +39.3108913 -076.6240941
4     U  04843        U          S    5530        0 +39.2246859 -076.5891821
5     U  04843        U          S    4407        0 +39.2922426 -076.6466720
6     U  04843        U          S   24258        0 +39.3711849 -076.6789280
7     U  04843        U          S    8955        0 +39.2881509 -076.6032131
8     U  04843        U          S    9388        0 +39.3542857 -076.6369351
9     U  04843        U          S   31413        0 +39.3611131 -076.6655301
10    U  04843        U          S   65802        0 +39.2059241 -076.5759715
   HOUSING20 POP20 STATEFP COUNTYFP       GEOID    NAME             NAMELSAD
1         27    38      24      510 24510270401 2704.01 Census Tract 2704.01
2        166   314      24      510 24510272007 2720.07 Census Tract 2720.07
3          0     0      24      510 24510130200    1302    Census Tract 1302
4         19    35      24      510 24510250500    2505    Census Tract 2505
5         16    35      24      510 24510200100    2001    Census Tract 2001
6         22   118      24      510 24510272003 2720.03 Census Tract 2720.03
7         35    71      24      510 24510030200     302     Census Tract 302
8          3    21      24      510 24510271300    2713    Census Tract 2713
9         23    41      24      510 24510271501 2715.01 Census Tract 2715.01
10         0     0      24      510 24510250500    2505    Census Tract 2505
   MTFCC FUNCSTAT   ALAND  AWATER    INTPTLAT     INTPTLON
1  G5020        S 1457277       0 +39.3462013 -076.5467855
2  G5020        S  802298       0 +39.3636968 -076.7064881
3  G5020        S  352623       0 +39.3123177 -076.6305781
4  G5020        S 8584546 4070107 +39.2176239 -076.5823444
5  G5020        S  264900       0 +39.2909656 -076.6478189
6  G5020        S 1934714       0 +39.3633286 -076.6854724
7  G5020        S  506034   91912 +39.2865363 -076.6022612
8  G5020        S 2861713    3175 +39.3628280 -076.6395253
9  G5020        S 3628340   18282 +39.3642233 -076.6614760
10 G5020        S 8584546 4070107 +39.2176239 -076.5823444
                         geometry
1  POLYGON ((-8522055 4771161,...
2  POLYGON ((-8538657 4773870,...
3  POLYGON ((-8529841 4766300,...
4  POLYGON ((-8525952 4753874,...
5  POLYGON ((-8532324 4763585,...
6  POLYGON ((-8536016 4775034,...
7  POLYGON ((-8527463 4763015,...
8  POLYGON ((-8531243 4772484,...
9  POLYGON ((-8534527 4773513,...
10 POLYGON ((-8524756 4751349,...

Then make a crosswalk between the area and tracts:

area_xwalk <- make_area_xwalk(area, block_xwalk)

area_xwalk
# A tibble: 2 × 5
  GEOID       TRACTCE20 NAME            HOUSING20 perc_HOUSING20
  <chr>       <chr>     <chr>               <dbl>          <dbl>
1 24510260501 260501    John Ruhrah EMS       610              1
2 24510260700 260700    John Ruhrah EMS       600              1

We can then use the crosswalk to adjust the ACS data:

area_acs_data <- use_area_xwalk(acs_data, area_xwalk)

area_acs_data
# A tibble: 1 × 22
  NAME         variable column_id table_id estimate   moe perc_estimate perc_moe
  <chr>        <chr>    <chr>     <chr>       <dbl> <dbl>         <dbl>    <dbl>
1 John Ruhrah… B25105_… B25105001 B25105       2977   160            NA       NA
# ℹ 14 more variables: geography <chr>, table_title <chr>,
#   simple_table_title <chr>, subject_area <chr>, universe <chr>,
#   denominator_column_id <chr>, topics <chr>, line_number <dbl>,
#   column_title <chr>, indent <dbl>, parent_column_id <chr>,
#   denominator_estimate <dbl>, denominator_moe <dbl>,
#   denominator_column_title <chr>

But data for a single planning area doesn’t say much without context. {getACS} extends {tidycensus} to support downloading multiple geographies at once:

regional_acs_data <- get_acs_geographies(
  geography = c("county", "metropolitan statistical area/micropolitan statistical area"),
  table = "B25105",
  state = "MD",
  county = "Baltimore city",
  msa = "Baltimore-Columbia-Towson, MD Metro Area"
)

tbl_acs_data <- bind_rows(
  area_acs_data,
  regional_acs_data
)

tbl_acs_data <- select_acs(tbl_acs_data, .column_title_col = NULL)

Then I can use gt_acs() (a {getACS} function built around the gt() function from the {gt} package) to create a formatted table:

gt_acs(
  data = tbl_acs_data,
  rowname_col = "NAME",
  value_label = "Median monthly housing costs",
  table = "B25105"
) |>
  fmt_currency(c("estimate", "moe"), decimals = 0)
Median monthly housing costs
John Ruhrah EMS $2,977 ± $160
Baltimore city, Maryland $1,184 ± $11
Baltimore-Columbia-Towson, MD Metro Area $1,510 ± $8
Source: 2017-2021 ACS 5-year Estimates, Table B25105.

Plotting demographic data with {ggplot2}

A similar workflow is used to create production-ready plots and graphics using {ggplot2}:

4 Why use R as your GIS?

4.1 What are the advantages of using R?

  • Speed: Working interactively at the command line or writing scripts can be even faster than pointing-and-clicking around a desktop GIS
  • Collaboration:: Engaging with the open-source R community offers opportunities to work users and developers
  • Reproducibility: Writing transparent and documented workflows with parametrized functions that you can use, reuse, and adapt

4.2 Who else is using R in Baltimore?

Within Baltimore City government, R is used routinely by data and research staff, the Baltimore City Department of Finance, the Mayor’s Office of Recovery Programs, Mayor’s Office of Performance and Innovation, and the Baltimore City Public School System.

4.3 Who else is using R for planning?

Urban Institute

Urban Institute

Urban Institute

“The suite of geospatial visualization and analysis tools in R is head and shoulders better than SAS and Stata and often easier and more reproducible than ArcGIS.”

Montgomery Planning

Montgomery Planning

Montgomery Planning

“Most of what we use R for is ‘under the hood’ to access and manipulate data that later gets inserted into plans, reports, and GIS applications.”

5 Resources

5.1 Books on programming with R

  • R for Data Science by Hadley Wickham “will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it and visualize.”
  • Hands on Programming with R by Garrett Grolemund, is “an introduction to R as a programming language and is a great place to start if R is your first programming language.”
  • Advanced R by Hadley Wickham “dives into the details of R the programming language; it’s great place to start if you have existing programming experience”

5.2 Books on spatial data analysis with R

  • Geocomputation with R by Robin Lovelace, Jakub Nowosad, and Jannes Muenchow is a book for people who want to analyze, visualize and model geographic data with open source software.
  • Spatial Data Science With Applications in R by Edzer Pebesma and Roger Bivand introduces fundamental aspects of spatial data that every data scientist should know before they start working with spatial data.

5.3 Additional resources

These GitHub lists include resources on R, Python, and QGIS:

  • R for Planners: A list of R projects and packages created for and by community and transportation planners for research and planning practice.
  • spatial data education: A list of educational resources, tutorials, and course materials related to building, organizing, and analyzing spatial data using R, Python, and other tools.
  • spatial data workshops: A list of workshops and hands-on tutorials on working with spatial data using R, Python, or desktop GIS software.

6 Thank you!

Find the presentation at elipousson.github.io/presentation-tugis2023/ or the code at github.com/elipousson/presentation-tugis2023