Load household energy burden cohort data with automatic fallback:
1. Try the local database
2. Fall back to local CSV files
3. Auto-download from OpenEI if neither exists
4. Auto-import downloaded data into the database for future use
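The fallback order above can be sketched as a chain of loaders tried in sequence. This is a minimal illustration, not the package's implementation: `first_available()` and the three stand-in loaders are hypothetical names, and each loader is assumed to return `NULL` or raise an error when its source is unavailable.

```r
# Hypothetical helper: try each loader in order and return the first
# non-NULL result together with the name of the source that produced it.
first_available <- function(loaders) {
  for (source_name in names(loaders)) {
    # Treat an error from a loader the same as "source not available".
    result <- tryCatch(loaders[[source_name]](), error = function(e) NULL)
    if (!is.null(result)) {
      return(list(source = source_name, data = result))
    }
  }
  stop("No data source available")
}

# Stand-in loaders (assumptions for illustration only).
loaders <- list(
  database = function() NULL,                        # no local database yet
  csv      = function() stop("CSV file not found"),  # no local CSV either
  download = function() {
    # Placeholder row, not real data.
    data.frame(geoid = "37135010100", households = 120)
  }
)

loaded <- first_available(loaders)
loaded$source
```

In this sketch the first two sources fail, so the result comes from the `download` stand-in. A real implementation would also write the downloaded data back to the database, as the last step in the list describes.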
Usage
load_cohort_data(
  dataset = c("ami", "fpl"),
  states = NULL,
  counties = NULL,
  vintage = "2022",
  income_brackets = NULL,
  verbose = TRUE,
  ...
)
Arguments
- dataset
Character, either "ami" (Area Median Income) or "fpl" (Federal Poverty Line)
- states
Character vector of state abbreviations to filter by (optional)
- counties
Character vector of county names or FIPS codes to filter by (optional). County names are matched case-insensitively. Requires states to be specified.
- vintage
Character, data vintage: "2018" or "2022" (default "2022")
- income_brackets
Character vector of income brackets to filter by (optional)
- verbose
Logical, print status messages (default TRUE)
- ...
Additional filter expressions passed to dplyr::filter() for dynamic filtering. Allows filtering by any column in the dataset using tidyverse syntax. Example:
households > 100, total_income > 50000
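As a rough sketch of how expressions supplied through `...` can reach `dplyr::filter()`, the hypothetical wrapper below simply forwards them. The name `apply_filters()` and the toy rows are assumptions for illustration; the package's internal handling may differ.

```r
library(dplyr)

# Hypothetical wrapper: forwards the filter expressions to dplyr::filter(),
# which evaluates them against the columns of `data`.
apply_filters <- function(data, ...) {
  dplyr::filter(data, ...)
}

# Toy rows with made-up values, using column names from this page.
toy <- tibble::tibble(
  households   = c(50, 150, 300),
  total_income = c(40000, 60000, 45000)
)

# Multiple expressions are combined with AND, so only rows that satisfy
# both conditions are kept.
kept <- apply_filters(toy, households > 100, total_income > 50000)
```

Here only the middle row passes both conditions. Forwarding `...` unchanged keeps the tidyverse filtering syntax intact, which is why any column in the dataset can be used in an expression.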
Value
A tibble with columns:
geoid: Census tract identifier
income_bracket: Income bracket label
households: Number of households
total_income: Total household income ($)
total_electricity_spend: Total electricity spending ($)
total_gas_spend: Total gas spending ($)
total_other_spend: Total other fuel spending ($)
TEN: Housing tenure category (1=Owned free/clear, 2=Owned with mortgage, 3=Rented, 4=Occupied without rent). Enables analysis of energy burden differences between renters and owners.
TEN-YBL6: Housing tenure crossed with year structure built (6 categories). Allows analysis of how building age and ownership status interact to affect energy burden (e.g., older rental units vs newer owner-occupied homes).
TEN-BLD: Housing tenure crossed with building type (e.g., single-family, multi-unit). Enables analysis of energy burden across different housing structures and ownership patterns.
TEN-HFL: Housing tenure crossed with primary heating fuel type (e.g., gas, electric, oil). Critical for analyzing how heating fuel choice and tenure status jointly influence energy costs and burden.
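To show how these columns combine into an energy burden figure, the sketch below computes burden as total energy spending divided by total income, aggregated by tenure code. The rows are made-up placeholder values, not real data. Summing spending and income within each group before dividing is one reasonable aggregation choice, assumed here rather than prescribed by the package.

```r
# Made-up placeholder rows (not real data), with columns named as above.
cohort <- data.frame(
  TEN                     = c(1, 2, 3, 3),
  households              = c(100, 200, 150, 50),
  total_income            = c(5e6, 12e6, 4e6, 1e6),
  total_electricity_spend = c(150000, 300000, 200000, 60000),
  total_gas_spend         = c(50000, 100000, 80000, 30000),
  total_other_spend       = c(0, 20000, 20000, 10000)
)

# Total energy spending per row: electricity + gas + other fuels.
cohort$total_energy_spend <- cohort$total_electricity_spend +
  cohort$total_gas_spend + cohort$total_other_spend

# Sum spending and income within each tenure code, then divide.
by_tenure <- aggregate(
  cbind(total_energy_spend, total_income) ~ TEN,
  data = cohort,
  FUN = sum
)
by_tenure$energy_burden <- by_tenure$total_energy_spend / by_tenure$total_income
```

With these placeholder values, the group with `TEN == 3` has the highest burden (0.08), compared with 0.04 and 0.035 for the other two groups.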
Examples
if (FALSE) { # \dontrun{
# Single state (fast, good for learning)
nc_ami <- load_cohort_data(dataset = "ami", states = "NC")
# Multiple states (regional analysis)
southeast <- load_cohort_data(dataset = "fpl", states = c("NC", "SC", "GA", "FL"))
# Nationwide (no states filter; covers all 51 state-level areas in the dataset)
us_data <- load_cohort_data(dataset = "ami", vintage = "2022")
# Load specific vintage
nc_2018 <- load_cohort_data(dataset = "ami", states = "NC", vintage = "2018")
# Filter to specific income brackets
low_income <- load_cohort_data(
  dataset = "ami",
  states = "NC",
  income_brackets = c("0-30% AMI", "30-50% AMI")
)
# Filter to specific counties within a state
triangle <- load_cohort_data(
  dataset = "fpl",
  states = "NC",
  counties = c("Orange", "Durham", "Wake")
)
# Or use county FIPS codes
orange <- load_cohort_data(
  dataset = "fpl",
  states = "NC",
  counties = "37135"
)
# Use dynamic filtering for custom criteria
high_burden <- load_cohort_data(
  dataset = "ami",
  states = "NC",
  households > 100,
  total_electricity_spend / total_income > 0.06
)
# Analyze energy burden by housing characteristics
# Compare renters vs owners by heating fuel type
nc_housing <- load_cohort_data(dataset = "ami", states = "NC")
library(dplyr)
# Group by tenure and heating fuel to analyze energy burden patterns
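housing_analysis <- nc_housing %>%
  # Drop rows with missing tenure fields or zero income (avoids division by zero)
  filter(!is.na(TEN), !is.na(`TEN-HFL`), total_income > 0) %>%
  group_by(TEN, `TEN-HFL`) %>%
  summarise(
    total_households = sum(households),
    # Household-weighted mean of each row's spend-to-income ratio
    avg_energy_burden = weighted.mean(
      (total_electricity_spend + total_gas_spend + total_other_spend) / total_income,
      w = households,
      na.rm = TRUE
    ),
    .groups = "drop"
  )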
} # }