Introduction
The emburden package provides tools for analyzing household energy burden using the Net Energy Return (Nh) methodology. This vignette will walk you through the basic workflow for calculating and analyzing energy burden metrics.
Installation
You can install emburden from GitHub:
# install.packages("devtools")
devtools::install_github("ericscheier/emburden")
What is Energy Burden?
Energy burden is the ratio of household energy spending to gross income:
Energy Burden (EB) = S / G
Where:
- S = Energy spending (electricity, gas, other fuels)
- G = Gross household income
A household spending $3,000 on energy with $50,000 income has a 6% energy burden.
Quick Example: Single Household
library(emburden)
# Calculate energy burden for a single household
gross_income <- 50000
energy_spending <- 3000
# Method 1: Direct energy burden
eb <- energy_burden_func(gross_income, energy_spending)
print(paste("Energy Burden:", scales::percent(eb)))
#> [1] "Energy Burden: 6%"
# Method 2: Via Net Energy Return (mathematically identical)
nh <- ner_func(gross_income, energy_spending)
neb <- 1 / (nh + 1)
print(paste("Net Energy Burden:", scales::percent(neb)))
#> [1] "Net Energy Burden: 6%"
print(paste("Net Energy Return:", round(nh, 2)))
#> [1] "Net Energy Return: 15.67"
For a single household, both methods give the same result: a 6% energy burden.
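As a cross-check on the package functions, the same quantities can be reproduced in base R. This is a minimal sketch that assumes Nh = (G - S) / S, a definition inferred from the 15.67 result above rather than taken from the package source. The helper names `eb_sketch` and `nh_sketch` are hypothetical and are not part of emburden.

```r
# Hypothetical helpers, not part of emburden: a base-R sketch of the formulas.
eb_sketch <- function(g, s) s / g        # energy burden: spending / gross income
nh_sketch <- function(g, s) (g - s) / s  # assumed Nh: net income per dollar of energy spending

g <- 50000
s <- 3000

eb <- eb_sketch(g, s)
nh <- nh_sketch(g, s)
neb <- 1 / (nh + 1)                      # same conversion used in the example above

print(round(eb, 4))
print(round(nh, 2))
print(isTRUE(all.equal(eb, neb)))        # TRUE when both routes agree
```

Under this assumed definition, 1 + Nh equals G / S, so 1 / (1 + Nh) is S / G again, which is why the two methods agree for a single household.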
Loading Data
The package automatically downloads data from OpenEI on first use:
# Load census tract data for North Carolina
nc_tracts <- load_census_tract_data(states = "NC")
# Load household cohort data by Area Median Income
nc_ami <- load_cohort_data(dataset = "ami", states = "NC")
# View structure
head(nc_ami)
Calculating Metrics from Cohort Data
When working with pre-aggregated cohort data (total income and spending), calculate metrics from the totals:
library(dplyr)
# Calculate mean income and spending from totals
nc_data <- nc_ami %>%
  mutate(
    mean_income = total_income / households,
    mean_energy_spending = (total_electricity_spend +
      coalesce(total_gas_spend, 0) +
      coalesce(total_other_spend, 0)) / households
  ) %>%
  # Drop cohorts with missing values or no households
  filter(!is.na(mean_income), !is.na(mean_energy_spending), households > 0) %>%
  mutate(
    eb = energy_burden_func(mean_income, mean_energy_spending),
    nh = ner_func(mean_income, mean_energy_spending),
    neb = neb_func(mean_income, mean_energy_spending)
  )
Aggregating Energy Burden (Critical!)
Important: Energy burden is a ratio, so it cannot be aggregated by taking an arithmetic mean of the individual burden values!
The WRONG Way
# ❌ WRONG: Direct averaging of energy burden introduces ~1-5% error
eb_wrong <- weighted.mean(nc_data$eb, nc_data$households)
The CORRECT Way: Via Net Energy Return
# ✅ CORRECT: Aggregate using Nh, then convert to NEB
nh_mean <- weighted.mean(nc_data$nh, nc_data$households)
neb_correct <- 1 / (1 + nh_mean)
print(paste("Correct NEB:", scales::percent(neb_correct)))
Why does this work? The Nh transformation lets us use a simple arithmetic weighted mean instead of a harmonic mean, making aggregation both simpler and more intuitive.
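The difference between the two approaches can be seen on a toy example. The sketch below uses two made-up cohorts (not LEAD data) and base R only, and it assumes Nh = (G - S) / S, which is consistent with the single-household example earlier. It compares direct averaging of energy burden, aggregation via Nh, and a household-weighted harmonic mean of energy burden.

```r
# Made-up cohorts for illustration only (not LEAD data).
cohorts <- data.frame(
  households = c(100, 100),
  mean_income = c(20000, 100000),
  mean_energy_spending = c(2000, 2000)
)

# Per-cohort metrics, assuming Nh = (G - S) / S
cohorts$eb <- cohorts$mean_energy_spending / cohorts$mean_income
cohorts$nh <- (cohorts$mean_income - cohorts$mean_energy_spending) /
  cohorts$mean_energy_spending

# Direct arithmetic averaging of the ratio
eb_direct <- weighted.mean(cohorts$eb, cohorts$households)

# Arithmetic averaging of Nh, then conversion back to a burden
nh_mean <- weighted.mean(cohorts$nh, cohorts$households)
neb_via_nh <- 1 / (1 + nh_mean)

# Household-weighted harmonic mean of eb, for comparison
eb_harmonic <- sum(cohorts$households) / sum(cohorts$households / cohorts$eb)

print(round(c(direct = eb_direct, via_nh = neb_via_nh, harmonic = eb_harmonic), 4))
```

With these numbers, direct averaging gives 6%, while the Nh route and the harmonic mean both give about 3.3%. The last two agree because 1 + Nh equals 1 / EB under the assumed definition, so an arithmetic mean of Nh corresponds to a harmonic mean of EB.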
Identifying High Energy Burden Households
The 6% energy burden threshold is commonly used to identify energy poverty:
# 6% energy burden corresponds to Nh = 1 / 0.06 - 1, or about 15.67
high_burden_threshold <- 15.67
# Lower Nh means higher burden, so high-burden cohorts fall below the threshold
high_burden_households <- sum(nc_data$households[nc_data$nh < high_burden_threshold])
total_households <- sum(nc_data$households)
high_burden_share <- high_burden_households / total_households
print(paste("Households with >6% energy burden:",
            scales::percent(high_burden_share)))
Using calculate_weighted_metrics()
For more complex grouped analysis, use the built-in function:
results <- calculate_weighted_metrics(
graph_data = nc_ami,
group_columns = "income_bracket",
metric_name = "ner",
metric_cutoff_level = 15.67, # 6% burden threshold
upper_quantile_view = 0.95,
lower_quantile_view = 0.05
)
# Format for publication
results$formatted_median <- to_percent(results$metric_median)
print(results)
Key Takeaways
- For single households: Both EB and NEB give identical results
- For aggregation: Always use the Nh method to avoid errors
- Never: Directly average energy burden values
- Data loading: Automatic from OpenEI (2018 and 2022 vintages available)
- Threshold: Energy burden above 6% (Nh < 15.67) identifies high-burden households
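The 15.67 cutoff follows from inverting NEB = 1 / (1 + Nh). A minimal sketch of that conversion is shown here. The helper name `eb_to_nh` is hypothetical and not part of emburden, and the 10% value is an arbitrary second threshold included only for illustration.

```r
# Hypothetical helper, not part of emburden.
# Inverts NEB = 1 / (1 + Nh) to get the Nh cutoff for a given burden threshold.
eb_to_nh <- function(eb) 1 / eb - 1

thresholds <- c(0.06, 0.10)  # 6% from the text; 10% is an arbitrary illustration
nh_cutoffs <- eb_to_nh(thresholds)
print(round(nh_cutoffs, 2))

# A higher burden maps to a lower Nh, so "high burden" means Nh below the cutoff
print(eb_to_nh(0.08) < eb_to_nh(0.06))
```

A 6% burden maps to an Nh of about 15.67 and a 10% burden maps to 9, so households above a burden threshold are the ones whose Nh falls below the corresponding cutoff.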
Temporal Comparison
The package provides a dedicated function for comparing energy burden across data vintages (2018 vs 2022):
# Compare by income bracket
comparison <- compare_energy_burden(
dataset = "ami",
states = "NC",
group_by = "income_bracket"
)
# View results
print(comparison)
# The function automatically:
# - Loads both 2018 and 2022 data
# - Normalizes schema differences (4 vs 6 AMI brackets)
# - Performs proper Nh-based aggregation
# - Calculates changes in energy burden
# Grouping options:
# - "income_bracket": Compare by AMI/FPL brackets (default)
# - "state": Compare multiple states
# - "none": Overall state-level comparison
# Example: State-level comparison
state_comparison <- compare_energy_burden(
dataset = "ami",
states = "NC",
group_by = "none"
)
# Access specific metrics
state_comparison$neb_2018 # 2018 energy burden
state_comparison$neb_2022 # 2022 energy burden
state_comparison$neb_change_pp # Change in percentage points
state_comparison$neb_change_pct # Relative change (percent)
This is much simpler than manually loading and aggregating both vintages!
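To make the distinction between the two change columns concrete, the sketch below computes a percentage-point change and a relative change from two made-up burden values. The formulas are assumptions about what `neb_change_pp` and `neb_change_pct` represent, based only on their names and comments above. Check the package documentation for the actual definitions.

```r
# Made-up burden values for illustration only (not package output).
neb_2018 <- 0.040
neb_2022 <- 0.045

# Assumed definitions; the package may compute these differently.
change_pp <- (neb_2022 - neb_2018) * 100              # change in percentage points
change_pct <- (neb_2022 - neb_2018) / neb_2018 * 100  # relative change, in percent

print(round(c(change_pp = change_pp, change_pct = change_pct), 2))
```

Here a move from 4.0% to 4.5% is a 0.5 percentage point increase but a 12.5% relative increase, so it matters which of the two is reported.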
Analyzing Energy Burden by Housing Characteristics
The LEAD Tool data includes detailed housing characteristics that enable analysis of how building attributes affect energy burden. Four key housing dimension columns are available:
- TEN: Housing tenure (1=Owned free/clear, 2=Owned with mortgage, 3=Rented, 4=Occupied without rent)
- TEN-YBL6: Tenure crossed with year structure built (6 categories)
- TEN-BLD: Tenure crossed with building type (single-family, multi-unit, etc.)
- TEN-HFL: Tenure crossed with primary heating fuel type (gas, electric, oil, etc.)
These columns preserve granular housing detail through the data aggregation process, allowing you to analyze energy burden patterns across different housing types.
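For readable output, the numeric TEN codes can be mapped to the labels listed above. This is a minimal base-R sketch with made-up example codes. The lookup vector `ten_labels` is a hypothetical helper, and whether the loaded TEN column is stored as numeric or character is an assumption, which is why the codes are converted with `as.character()` first.

```r
# Labels taken from the TEN code list above; the lookup itself is hypothetical.
ten_labels <- c(
  "1" = "Owned free/clear",
  "2" = "Owned with mortgage",
  "3" = "Rented",
  "4" = "Occupied without rent"
)

# Made-up example codes; as.character() handles numeric or character input.
ten_codes <- c(3, 1, 2, 3, 4)
ten_readable <- unname(ten_labels[as.character(ten_codes)])
print(ten_readable)
```

The same lookup could be applied inside a `mutate()` call before grouping, so that summary tables show labels rather than codes.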
Example: Comparing Renters vs Owners by Heating Fuel
# Load data with housing characteristics
nc_housing <- load_cohort_data(dataset = "ami", states = "NC")
# Analyze energy burden by tenure and heating fuel
housing_analysis <- nc_housing %>%
  filter(!is.na(TEN), !is.na(`TEN-HFL`)) %>%
  mutate(
    mean_income = total_income / households,
    mean_energy_spending = (total_electricity_spend +
      coalesce(total_gas_spend, 0) +
      coalesce(total_other_spend, 0)) / households,
    nh = ner_func(mean_income, mean_energy_spending)
  ) %>%
  group_by(TEN, `TEN-HFL`) %>%
  summarise(
    total_households = sum(households),
    nh_mean = weighted.mean(nh, households),
    neb = 1 / (1 + nh_mean),
    .groups = "drop"
  ) %>%
  arrange(desc(neb))
# View the top 10 tenure-heating fuel combinations with highest burden
head(housing_analysis, 10)
Example: Energy Burden by Building Age and Type
# Analyze by building characteristics
building_analysis <- nc_housing %>%
  filter(!is.na(`TEN-YBL6`), !is.na(`TEN-BLD`)) %>%
  mutate(
    mean_income = total_income / households,
    mean_energy_spending = (total_electricity_spend +
      coalesce(total_gas_spend, 0) +
      coalesce(total_other_spend, 0)) / households,
    nh = ner_func(mean_income, mean_energy_spending)
  ) %>%
  group_by(`TEN-YBL6`, `TEN-BLD`) %>%
  summarise(
    total_households = sum(households),
    nh_mean = weighted.mean(nh, households),
    neb = 1 / (1 + nh_mean),
    .groups = "drop"
  )
# Identify building age/type combinations with highest burden
high_burden_buildings <- building_analysis %>%
  filter(neb > 0.06) %>% # Above 6% burden threshold
  arrange(desc(neb))
print(high_burden_buildings)
Key Insights from Housing Analysis
Housing characteristic analysis can reveal:
- Tenure effects: How renters vs owners experience different energy burdens
- Heating fuel disparities: Which fuel types create higher burden (often oil/propane)
- Building age impacts: Older buildings typically have higher burden due to poor insulation
- Structure type patterns: Multi-family vs single-family burden differences
- Vulnerable populations: Combinations like “renter + old building + expensive fuel” often show extreme burden
This granular analysis helps target energy efficiency interventions to the housing types and populations that need them most.
Next Steps
- See vignette("methodology") for mathematical details
- See NEB_QUICKSTART.md for quick reference
- Run example scripts in the analysis/scripts/ directory
- Read full documentation: ?energy_burden_func, ?ner_func
References
- Paper: “Net energy metrics reveal striking disparities across United States household energy burdens”
- LEAD Tool Data: https://data.openei.org/
- GitHub: https://github.com/ericscheier/emburden