Pivot NHGIS data longer to assign denominator variables and join percent values
Source: R/pivot_ipumsr.R
pivot_nhgis_data() uses tidyr::pivot_longer() to switch NHGIS data from a
wide to a long format and creates a denominator column based on a preset
crosswalk between variables and their corresponding denominators. The input
data must have variable column labels present.
join_nhgis_percent() uses the denominator values added in pivoting to
calculate a percent share value. This feature is supported for many, but not
all, of the most popular NHGIS time series variables. Both functions use
conventions similar to those of the {getACS} package so that code can be
reused between NHGIS and American Community Survey (ACS) data.
At present, all numeric columns that do not appear to be an identifier are pivoted.
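The pivot, denominator, and percent steps described above can be sketched with tidyr and dplyr alone. This is a rough illustration under stated assumptions, not the package's implementation: the toy data are made up, X99AA is a hypothetical placeholder code, the output column names denominator_value and perc_value are guesses based on the default prefixes, and the share is shown as a proportion, which may differ from what join_nhgis_percent() returns. A00AA is the persons denominator listed in the default denominators argument.

```r
library(tidyr)
library(dplyr)

# Toy wide table; identifiers and counts are invented for illustration.
wide <- data.frame(
  GISJOIN = c("G01", "G02"),
  YEAR = c("2020", "2020"),
  A00AA = c(1000, 2000), # total persons (the denominator)
  X99AA = c(250, 500)    # hypothetical placeholder variable code
)

# Step 1: pivot the value columns into variable/value pairs.
long <- wide |>
  pivot_longer(
    cols = c(A00AA, X99AA),
    names_to = "variable",
    values_to = "value",
    cols_vary = "slowest"
  )

# Step 2: pull the denominator value for each place and year.
denominator_values <- long |>
  filter(variable == "A00AA") |>
  select(GISJOIN, YEAR, denominator_value = value)

# Step 3: join on the identifier columns and compute the share.
result <- long |>
  left_join(denominator_values, by = c("GISJOIN", "YEAR")) |>
  mutate(perc_value = round(value / denominator_value, digits = 2))

print(result)
```

Joining on GISJOIN and YEAR mirrors the default join_cols, so each value is compared only with the denominator for the same place and year.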
Usage
pivot_nhgis_data(
data,
variable_col = "variable",
value_col = "value",
column_title_col = "column_title",
denominator_prefix = "denominator_",
cols_vary = "slowest",
denominators = list(persons = "A00AA", families = "A68AA", housing_units = "A41AA",
occupied_units = "A43AA"),
call = caller_env()
)
join_nhgis_percent(
data,
variable_col = "variable",
value_col = "value",
column_title_col = "column_title",
denominator_prefix = "denominator_",
perc_prefix = "perc_",
join_cols = c("GISJOIN", "YEAR"),
digits = 2
)
Arguments
- data
A data frame to pivot.
- variable_col
Variable column name
- value_col
Value column name
- column_title_col
Column title column name (to be created from column labels)
- cols_vary
When pivoting cols into longer format, how should the output rows be arranged relative to their original row number?
"fastest", the default in tidyr::pivot_longer(), keeps individual rows from cols close together in the output. This often produces intuitively ordered output when you have at least one key column from data that is not involved in the pivoting process.
"slowest", the default for pivot_nhgis_data(), keeps individual columns from cols close together in the output. This often produces intuitively ordered output when you utilize all of the columns from data in the pivoting process.
- denominators
Named list of denominator values.
- call
The execution environment of a currently running function, e.g. call = caller_env(). The corresponding function call is retrieved and mentioned in error messages as the source of the error.
You only need to supply call when throwing a condition from a helper function which wouldn't be relevant to mention in the message.
Can also be NULL or a defused function call to respectively not display any call or hard-code a code to display.
For more information about error calls, see Including function calls in error messages.
- perc_prefix
Prefix string to use for calculated percent value.
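The effect of the cols_vary argument described above is easiest to see on a small example. The sketch below calls tidyr::pivot_longer() directly on a made-up data frame (the column names id, x, and y are illustrative only) and compares the row order produced by each setting.

```r
library(tidyr)

# Two rows and two value columns, invented for illustration.
wide <- data.frame(
  id = c("a", "b"),
  x = c(1, 2),
  y = c(3, 4)
)

# "fastest": the values from each original row stay next to each other.
fastest <- pivot_longer(
  wide,
  cols = c(x, y),
  names_to = "variable",
  values_to = "value",
  cols_vary = "fastest"
)

# "slowest": the values from each original column stay next to each other.
slowest <- pivot_longer(
  wide,
  cols = c(x, y),
  names_to = "variable",
  values_to = "value",
  cols_vary = "slowest"
)

fastest$variable # "x" "y" "x" "y"
slowest$variable # "x" "x" "y" "y"
```

With "slowest", all rows for one variable appear as a contiguous block, which keeps each variable's rows together in the pivoted output.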