define_nhgis_ts_extract() is a wrapper for ipumsr::define_extract_nhgis() with defaults that support the creation of tidy data using read_nhgis_data() or pivot_nhgis_data().

Usage

define_nhgis_ts_extract(
  year = NULL,
  tables = NULL,
  geography = c("county", "state"),
  extent = "us",
  output = c("tidy", "wide", "file"),
  shape_year = NULL,
  basis = 2008,
  geometry = FALSE,
  ...,
  time_series_tables = NULL,
  description = NULL,
  shapefiles = NULL,
  data_format = "csv_no_header",
  validate = TRUE,
  api_key = Sys.getenv("IPUMS_API_KEY")
)

Arguments

output

Sets the tst_layout value passed to ipumsr::define_extract_nhgis(). One of "tidy", "wide", or "file", corresponding to "time_by_row_layout", "time_by_column_layout", or "time_by_file_layout", respectively.
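The correspondence between output and tst_layout can be sketched in base R. This is only an illustration of the mapping described above, not the package's actual implementation; the helper name output_to_tst_layout() is hypothetical.

```r
# Hypothetical helper (not part of any package): translate the `output`
# shorthand into the matching tst_layout value.
output_to_tst_layout <- function(output = c("tidy", "wide", "file")) {
  # match.arg() picks the first choice when `output` is left at its default
  # and errors on any value outside the three allowed choices.
  output <- match.arg(output)
  switch(
    output,
    tidy = "time_by_row_layout",
    wide = "time_by_column_layout",
    file = "time_by_file_layout"
  )
}

output_to_tst_layout()        # uses the first choice, "tidy"
output_to_tst_layout("wide")
```

Because match.arg() is used, an unsupported value such as "long" raises an error instead of silently falling back to a layout.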

geometry

If TRUE, include shapefiles in the defined extract. If shapefiles is NULL, the function uses list_nhgis_shapefiles() with shape_year as the year parameter.

...

Arguments passed on to ipumsr::define_extract_nhgis

datasets

List of dataset specifications for any datasets to include in the extract request. Use ds_spec() to create a ds_spec object containing a dataset specification. See examples.

geographic_extents

Vector of geographic extents to use for all of the datasets in the extract definition (for instance, to obtain data within a particular state). Use "*" to select all available extents.

Required when any of the datasets included in the extract definition include geog_levels that require extent selection. See get_metadata_nhgis() to determine if a geographic level requires extent selection. At the time of writing, NHGIS supports extent selection only for blocks and block groups.

breakdown_and_data_type_layout

The desired layout of any datasets that have multiple data types or breakdown values.

  • "single_file" (default) keeps all data types and breakdown values in one file

  • "separate_files" splits each data type or breakdown value into its own file

Required if any datasets included in the extract definition consist of multiple data types (for instance, estimates and margins of error) or have multiple breakdown values specified. See get_metadata_nhgis() to determine whether a requested dataset has multiple data types.

time_series_tables

List of time series table specifications for any time series tables to include in the extract request. Use tst_spec() to create a tst_spec object containing a time series table specification. See examples.
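As a rough sketch of what a time series table specification looks like when passed to ipumsr directly, the example below builds a tst_spec object and supplies it to define_extract_nhgis(). The table name "CW3" and the year are illustrative assumptions; check get_metadata_nhgis() for the tables and years actually available. The layout and format are set explicitly, so the example does not rely on any defaults.

```r
library(ipumsr)

# One time series table at the county level.
# "CW3" and "1990" are placeholder values chosen for illustration.
county_tst <- tst_spec(
  name = "CW3",
  geog_levels = "county",
  years = "1990"
)

# Define (but do not submit) an extract. "time_by_row_layout" is the layout
# that the "tidy" output option corresponds to.
extract <- define_extract_nhgis(
  description = "Example time series extract",
  time_series_tables = list(county_tst),
  tst_layout = "time_by_row_layout",
  data_format = "csv_no_header"
)

extract$tst_layout
```

Defining an extract only builds the request object locally; an API key is needed once the extract is submitted.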

description

Description of the extract.

shapefiles

Names of any shapefiles to include in the extract request.

data_format

The desired format of the extract data file.

  • "csv_no_header" (default) includes only a minimal header in the first row

  • "csv_header" includes a second, more descriptive header row

  • "fixed_width" provides data in a fixed width format

Note that by default, read_nhgis() removes the additional header row in "csv_header" files.

Required when an extract definition includes any datasets or time_series_tables.

api_key

API key associated with your user account. Defaults to the value of the IPUMS_API_KEY environment variable. See set_ipums_api_key().
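Since the default is read from an environment variable, it can be useful to check that the variable is set before defining or submitting an extract. The following is a minimal base R sketch; the helper name has_ipums_key() is hypothetical and not part of ipumsr or this package.

```r
# Hypothetical helper: TRUE when a non-empty key string is available.
# Sys.getenv() returns "" for an unset variable, so nzchar() is the check.
has_ipums_key <- function(key = Sys.getenv("IPUMS_API_KEY")) {
  is.character(key) && length(key) == 1 && nzchar(key)
}

if (!has_ipums_key()) {
  message("IPUMS_API_KEY is not set; see ipumsr::set_ipums_api_key().")
}
```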