September 7: Recap & Next Steps

recap
Author

Eli Pousson

Published

September 10, 2022

Thanks for your flexibility as we work out a balanced approach to the course this fall. You can see that I’ve added these recap and next steps updates to the course website so they are easy to find alongside the other course materials.

What did we do for Session 2

Schedule changes

We confirmed class schedule changes for October 5 and November 23. I’ll send out calendar invitations for the following sessions:

  • Monday, October 3, 6:30 pm to 7:30 pm (participation is recommended but the session will be recorded for anyone who can’t attend)

  • Monday, November 21, 6:00 pm to 7:30 pm (participation is optional - this session will be formatted as a drop-in discussion on final project proposals)

We also pushed our next few assignment due dates forward by one week:

  • Find spatial data is due on September 14
  • Read spatial data and make a map with R and with QGIS are both due September 28
  • Compare spatial and geometry operations in QGIS and R is due on October 5

I am keeping the other assignment due dates in place for now but I may make other adjustments if needed. The course schedule is updated to note these schedule changes.

Topics and discussion

We reviewed the process of using GitHub Desktop and the basics of using GitHub to submit assignments. I’m still putting together a more detailed “how to” for this process and will add a link to that resource to this recap and the assignment pages when it is published.

We started with a quick review of last week’s introduction to R and RStudio including:

  • Installing packages from CRAN using install.packages() or from GitHub or other sources using pak::pkg_install()
  • Using RStudio project or .Rproj files (see Workflow: projects in Wickham and Grolemund (2022) for more details)
  • Cloning the course website repository to access the example Quarto document or .qmd files and related spatial data

We continued to review the example on the basics of working with simple feature data in R this week including:

  • How data frames include rows and columns with basic data types including numeric (including double and integer values) and character vectors

  • Using logical operators with the filter function (see Filter rows with filter() from Wickham and Grolemund (2022))

  • How sf objects use a geometry column (usually named geom or geometry) to store spatial data (and how to use the sf::st_geometry() function to access this column)

  • How spatial data is organized using an open data standard (officially named simple feature access) around points, lines, and polygons

  • Summarizing data using the summary(), skimr::skim(), and mapview::mapview() functions

I got a little off-topic and explained how using a “nerd” font offers a better display of logical operators. I recommended Fira Code but Hasklig is another popular alternative. See RStudio documentation on Appearance and Themes for more on customizing your Editor Font option.

In the second part of our class, we discussed the case study on the Arnold Arboretum from Loukissas (2019) and reflected on the parallels between Loukissas’ place-based approach to the analysis based on Patricia Hill Collins’ concept of the matrix of domination from D’Ignazio and Klein (2021).

D’Ignazio and Klein also highlighted a few examples of counterdata projects that included:

If you have your own examples of “counterdata”, please feel free to share them in the course Discord!

Next steps

Everyone has set up a GitHub user account which counts as full credit for the first assignment.

Submit the Finding data assignment

Thanks to everyone who already completed the finding data assignment. This assignment did not include any options for bonus points so if you submitted the assignment you should get full credit for completion on the second assignment. I’ll share feedback via GitHub but post resources related to your topics of interest to the Discord so others can also find them.

If you have not already completed the assignment, please use the 01_find-data_template.md file in the assignments folder of your personal course repository. Keep the headings in place but you can edit everything else in the document. Rename the file to 01_find-data.md and commit your changes when you are done with the assignment.

Submit a weekly log entry

Since we extended the deadline for the weekly log, you can write the reflection based on any of the readings in the course so far. You are also welcome to incorporate outside materials into your reflection - the main goal of this assignment is to give you an opportunity to write about what you’re thinking about and learning during the class. Each log should also include a question. If you’re able to submit your log before our class session, we can use your questions to help guide our in-person work.

Make sure to copy the template file 2022-MM-DD_log.md and save a new copy replacing MM with the two-digit month and DD with the two-digit date of the class session.

What to expect for Session 3

Next week, we’re going to continue our introduction to spatial data with R by covering:

  • Converting tabular data into spatial data (covered in this example)
  • Summarizing or aggregating attribute data
  • Basics of visualizing attribute data with maps and plots

We will also spend some time on an introduction to QGIS with a focus on learning equivalent approaches for the activities we’ve covered in R so far.

This topic is covered in Learn QGIS by Andrew Cutts and Anita Graser but you are welcome to use alternate resources instead.

The QGIS User Guide covers similar material and there are a variety of helpful tutorial videos on YouTube including:

  • QGIS Uncovered by Steven Bernard (a design editor with the Financial Times): These videos are a few years old but they are short and lessons 4, 9, and 10 cover the information you need for the QGIS map making assignment in an accessible format.
  • An Introductory QGIS Workshop for Beginners from Michele Tobias (UC Davis DataLab): This recorded workshop is over 3 hours long but you may find it helpful to check out the section on mapping vector data (starting at 1:24:00) and the section on working with the map layout tools (starting at 2:30:00). There is also a corresponding GitHub repository with more resources.

I’m still in the process of adjusting the schedule to make sure we don’t have any more overloaded sessions so I’m holding off on the poll we discussed in class for now. As always, any feedback on your preferred approach to engaging with this material or questions about anything we have covered so far is welcome.

References

D’Ignazio, Catherine, and Lauren Klein. 2021. “Who Collects the Data? A Tale of Three Maps.” MIT Case Studies in Social and Ethical Responsibilities of Computing, no. Winter 2021 (February). https://doi.org/10.21428/2c646de5.fc6a97cc.
Loukissas, Yanni Alexander. 2019. All Data Are Local: Thinking Critically in a Data-Driven Society. https://doi.org/10.7551/mitpress/11543.001.0001.
Wickham, Hadley, and Garrett Grolemund. 2022. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd edition (WIP). Sebastopol, CA: O’Reilly Media. https://r4ds.had.co.nz.