pivot_nhgis_data() uses tidyr::pivot_longer() to switch NHGIS data from a wide to a long format and creates a denominator column based on a preset crosswalk between variables and their corresponding denominators. The input data must have variable column labels present.

join_nhgis_percent() uses the denominator values added in pivoting to calculate a percent share value. This feature is supported for many but not all of the most popular NHGIS time series variables. Both functions use a similar set of conventions as the {getACS} package to support ease of code reuse between NHGIS and American Community Survey (ACS) data.

At present, all numeric columns that do not appear to be identifiers are pivoted.
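The wide-to-long step described above can be sketched with tidyr alone. This is an illustration under stated assumptions, not the package's implementation: the sample data, the variable code X01AA, the crosswalk vector, and the denominator_variable column name are all hypothetical. Only A00AA (persons) is taken from the default denominators shown in Usage.

```r
library(tidyr)

# Hypothetical wide extract: one row per place and year, one column per
# time series variable. YEAR is stored as character so that it reads as
# an identifier rather than a value column.
wide <- data.frame(
  GISJOIN = c("G0100010", "G0100030"),
  YEAR = c("2010", "2010"),
  A00AA = c(1000, 2000), # total persons
  X01AA = c(250, 300)    # hypothetical count variable
)

# Pivot the value columns into variable/value pairs.
# cols_vary requires tidyr 1.3.0 or later.
long <- pivot_longer(
  wide,
  cols = c(A00AA, X01AA),
  names_to = "variable",
  values_to = "value",
  cols_vary = "slowest"
)

# Hypothetical crosswalk from a variable code to its denominator code.
crosswalk <- c(X01AA = "A00AA")

# Variables with no crosswalk entry get NA as their denominator.
long$denominator_variable <- unname(crosswalk[long$variable])
```

Each row of long now carries the code of the variable it should be divided by, which is the information a later percent calculation needs.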

Usage

pivot_nhgis_data(
  data,
  variable_col = "variable",
  value_col = "value",
  column_title_col = "column_title",
  denominator_prefix = "denominator_",
  cols_vary = "slowest",
  denominators = list(persons = "A00AA", families = "A68AA", housing_units = "A41AA",
    occupied_units = "A43AA"),
  call = caller_env()
)

join_nhgis_percent(
  data,
  variable_col = "variable",
  value_col = "value",
  column_title_col = "column_title",
  denominator_prefix = "denominator_",
  perc_prefix = "perc_",
  join_cols = c("GISJOIN", "YEAR"),
  digits = 2
)

Arguments

data

A data frame to pivot.

variable_col

Variable column name

value_col

Value column name

column_title_col

Column title column name (to be created from column labels)

cols_vary

When pivoting cols into longer format, how should the output rows be arranged relative to their original row number?

  • "fastest", the default in tidyr::pivot_longer(), keeps individual rows from cols close together in the output. This often produces intuitively ordered output when you have at least one key column from data that is not involved in the pivoting process. Note that pivot_nhgis_data() defaults to "slowest" instead, as shown in Usage.

  • "slowest" keeps individual columns from cols close together in the output. This often produces intuitively ordered output when you utilize all of the columns from data in the pivoting process.
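The difference between the two cols_vary settings is easiest to see on a small table. The sketch below calls tidyr::pivot_longer() directly (tidyr 1.3.0 or later) on made-up data. The column names id, x, and y are placeholders and do not come from NHGIS.

```r
library(tidyr)

wide <- data.frame(id = c("a", "b"), x = c(1, 2), y = c(3, 4))

# "fastest": all pivoted columns for the first input row, then the second.
fastest <- pivot_longer(wide, cols = c(x, y), cols_vary = "fastest")

# "slowest": all input rows for the first pivoted column, then the second.
slowest <- pivot_longer(wide, cols = c(x, y), cols_vary = "slowest")

fastest$name # "x" "y" "x" "y"
slowest$name # "x" "x" "y" "y"
```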

denominators

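Named list of denominator values. Each name identifies a denominator (e.g. persons) and each value is the NHGIS variable code used for it (e.g. "A00AA").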

call

The execution environment of a currently running function, e.g. call = caller_env(). The corresponding function call is retrieved and mentioned in error messages as the source of the error.

You only need to supply call when throwing a condition from a helper function which wouldn't be relevant to mention in the message.

Can also be NULL or a defused function call to respectively not display any call or hard-code a call to display.

For more information about error calls, see Including function calls in error messages.

perc_prefix

Prefix string to use for calculated percent value.
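As a rough illustration of how a prefixed percent column could be derived from long-format data, the base R sketch below joins each row to its denominator value and divides. It is a sketch under stated assumptions, not the package's implementation: the sample data, the code X01AA, and the denominator_variable and denominator_value column names are hypothetical. The source also does not say whether the share is scaled by 100, so the sketch leaves it as a proportion.

```r
# Hypothetical long-format data with a denominator code on each row.
long <- data.frame(
  GISJOIN = c("G0100010", "G0100030", "G0100010", "G0100030"),
  YEAR = "2010",
  variable = c("A00AA", "A00AA", "X01AA", "X01AA"),
  value = c(1000, 2000, 250, 300),
  denominator_variable = c(NA, NA, "A00AA", "A00AA")
)

# Lookup table of candidate denominator values, renamed so that it joins
# on the denominator code rather than on the variable itself.
denoms <- long[, c("GISJOIN", "YEAR", "variable", "value")]
names(denoms) <- c(
  "GISJOIN", "YEAR", "denominator_variable", "denominator_value"
)

# Left join on the default join_cols plus the denominator code.
joined <- merge(
  long, denoms,
  by = c("GISJOIN", "YEAR", "denominator_variable"),
  all.x = TRUE
)

# "perc_" prefix plus the value column name; rounded as with digits = 2.
joined$perc_value <- round(joined$value / joined$denominator_value, digits = 2)
```

Rows without a denominator (here the A00AA totals) keep NA in perc_value instead of being dropped, because the merge is a left join.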