Center for Studies in Demography and Ecology

Reproducible GIS analysis with R

Instructor: Phil Hurvitz


Objective
This course will introduce the use of R for script-based geographic information systems (GIS) overlay analysis workflows. By the end of the course you will be able to script simple GIS analyses and present results in interactive, self-documented web pages.

Rationale
Desktop GIS packages are excellent for exploratory spatial data analysis and for producing map graphics, but their typical workflows are poorly suited to scientific research. If you can’t remember, reproduce, or clearly communicate your methodology, then what you are doing cannot technically be called “research.” Both ArcGIS Desktop and QGIS, two leading desktop GIS applications, use Python for scripting geoprocessing analyses and user interfaces, but Python can be difficult to learn and takes a strong commitment and a large amount of time to master.

R is an alternative to ArcPy or the QGIS Python API: it is widely used, extensible, free, and script-based. If you have been taking courses in social science, epidemiology, or statistics at UW, you probably already know enough R to get started using it as a GIS analysis platform. R is also increasingly used for programming tasks beyond its original role as a statistical programming environment, largely because of the proliferation of packages that extend its functionality. One of those packages, sf (Simple Features for R), extends the R analytic framework to cover a wide range of spatial data management and analysis tasks, and it stores geometric objects as a column in a data frame or tibble. Additionally, the leaflet package can render “slippy” (zoomable and pannable in real time) maps embedded in web pages created with R Markdown. Combining these capabilities with R’s relatively easy programming language can help you perform GIS analyses, generate data summaries from geoprocessing operations, render high-quality tables and graphics, and include maps in your reports.
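
As a rough illustration of how these packages fit together, the following minimal sketch builds an sf object from an ordinary data frame, reprojects it, buffers it, and draws it on a leaflet map. It assumes the sf and leaflet packages are installed. The point names, the approximate coordinates, the choice of UTM zone 10N (EPSG:32610), and the 500 m buffer distance are all hypothetical choices made for this example, not course data.

```r
library(sf)
library(leaflet)

# Hypothetical points with approximate longitude/latitude (WGS84)
places <- data.frame(
  name = c("Downtown Seattle", "UW campus"),
  lon  = c(-122.33, -122.30),
  lat  = c(47.61, 47.655)
)

# Convert the plain data frame to an sf object; the coordinates become
# a geometry column and the remaining columns stay as attributes
pts <- st_as_sf(places, coords = c("lon", "lat"), crs = 4326)

# An sf object is still a data frame, so familiar tools keep working
is_df <- inherits(pts, "data.frame")

# Project to a metric CRS (assumed here: UTM zone 10N) before buffering,
# then buffer each point by 500 m
pts_utm <- st_transform(pts, 32610)
buffers <- st_buffer(pts_utm, dist = 500)

# leaflet expects WGS84 coordinates, so transform back before mapping
buffers_wgs <- st_transform(buffers, 4326)

m <- leaflet(buffers_wgs) %>%
  addTiles() %>%
  addPolygons(popup = ~name)
```

Printing `m` in an interactive session, or placing it in an R Markdown code chunk, displays the map.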

Prerequisites

You will need the following skills to follow along with the class.

  • Beginning to intermediate skills with desktop GIS (e.g., ArcGIS Desktop or QGIS), including basic knowledge of overlay operations (e.g., intersect, identity)
  • Intermediate skills using vectors, variables, and data frames in R, for example:
    • Defining variables (creating, setting values)
    • Subsetting data frames
    • Creating and calculating new columns in data frames
    • Summarizing one column based on group identifiers in another column
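
As a self-check, the R skills listed above might look like the following short base R sketch. The data frame, its column names, and the threshold value are hypothetical and exist only to illustrate the four operations.

```r
# Hypothetical data: tracts with a county label, population, and area
tracts <- data.frame(
  tract    = c("A", "B", "C", "D"),
  county   = c("King", "King", "Pierce", "Pierce"),
  pop      = c(1200, 800, 1500, 500),
  area_km2 = c(2, 4, 5, 1)
)

# Defining a variable
min_pop <- 700

# Subsetting a data frame: keep rows at or above the threshold
big_tracts <- tracts[tracts$pop >= min_pop, ]

# Creating and calculating a new column
big_tracts$density <- big_tracts$pop / big_tracts$area_km2

# Summarizing one column based on group identifiers in another column
pop_by_county <- aggregate(pop ~ county, data = big_tracts, FUN = sum)
print(pop_by_county)
```

If each of these lines reads naturally to you, you should be comfortable with the R used in class.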

Before class

Please make sure to get your Census API key and RStudio Cloud account before registering for this course.

  • A US Census API key, which you will need in order to download data using tidycensus. Getting a key may take a few days, so please request it well in advance.
  • An RStudio Cloud account

Online materials

All materials for the workshop are available at http://staff.washington.edu/phurvitz/r_gis/

Outline

  1. Quick review of GIS overlay analysis with QGIS or ArcGIS Desktop
  2. Reading GIS data into R
  3. Reading geographically referenced census data into R with tidycensus
  4. Coordinate transformations
  5. Performing geoprocessing operations with sf functions
  6. Writing GIS data from R to GeoPackage and shapefile outputs
  7. Generating summary tables and graphs from overlay operations
  8. Generating a leaflet map to present results