Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Based on a small pilot study, we have calculated the required sample sizes to detect meaningful effects on each main outcome.

Bill payment rates. The pilot showed a 60-percentage-point difference in bill payment rates between treatment (95%) and control (35%) groups. Using a two-sided test with α=0.05 and 80% power, we would need approximately 11 participants per group to detect this effect size (Cohen's h=1.34). However, we anticipate smaller effect sizes at this scale due to the reduced intensity of support compared to the pilot. Conservatively assuming a 20-percentage-point difference in payment rates, we would need 97 participants per group.

Disconnection rates. Comparable interventions in utility bill management suggest a potential reduction in disconnection rates from 15% to 7.5%. To detect this effect size, we would need 376 participants per group.

Energy consumption. The pilot demonstrated a 20% reduction in electricity usage and a 10% reduction in gas consumption. For a conservative 7% reduction in overall energy consumption, which aligns with findings from behavioral interventions in energy use, we would need approximately 283 participants per group to achieve 80% power.

Arrearages. We would need approximately 156 participants per group to detect a 15-percentage-point difference in arrearages.

Energy burden. For a 10% relative reduction in energy burden, consistent with outcomes from other energy assistance interventions, we would need approximately 425 participants per group.
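For transparency, the proportion-based calculations above can be sketched with the standard normal-approximation formula for a two-proportion comparison. This is a minimal illustration, not the protocol's exact procedure: the figures it produces are unadjusted for clustering design effects or attrition, so they will differ somewhat from the sample sizes quoted in the text, and the function names are illustrative.

```python
import math

Z_ALPHA = 1.959964  # critical value for two-sided alpha = 0.05
Z_BETA = 0.841621   # critical value for 80% power

def n_per_group(p1: float, p2: float) -> int:
    """Unadjusted participants per group to detect p1 vs p2
    with a two-sided two-proportion z-test (alpha=0.05, power=0.80)."""
    p_bar = (p1 + p2) / 2
    numerator = (Z_ALPHA * math.sqrt(2 * p_bar * (1 - p_bar))
                 + Z_BETA * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

def cohens_h(p1: float, p2: float) -> float:
    """Arcsine-transformed effect size for two proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Pilot payment-rate contrast (95% vs 35%) and the
# disconnection-rate contrast (15% vs 7.5%) from the text:
print(cohens_h(0.95, 0.35))
print(n_per_group(0.95, 0.35))
print(n_per_group(0.15, 0.075))
```

The unpooled-variance formula used here is one common convention; pooled-variance or exact-test formulas, or design-effect multipliers for clustering, would yield somewhat different numbers.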
Our planned sample of 500 participants per group exceeds all these requirements, providing sufficient power even with anticipated attrition of up to 15%. This sample size also enables us to detect relatively small effects (5-7 percentage point differences) in secondary outcomes, conduct meaningful subgroup analyses by demographic characteristics to identify heterogeneous treatment effects, and maintain adequate power for stratified analyses examining impacts by household composition, geographic location, and baseline energy burden. We will supplement these a priori power calculations with sensitivity analyses during the study to refine our estimates of minimum detectable effects based on observed variance in outcomes, correlation between baseline and follow-up measures, and actual attrition rates.
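The attrition and clustering adjustments referenced above can be illustrated numerically. A minimal sketch, where the cluster-size and intracluster-correlation (ICC) values are placeholder assumptions rather than figures from the protocol:

```python
import math

def adjust_n(n_raw: int, attrition: float = 0.15,
             cluster_size: int = 1, icc: float = 0.0) -> int:
    """Inflate a raw per-group sample size for the design effect
    DEFF = 1 + (m - 1) * ICC and for expected attrition."""
    deff = 1 + (cluster_size - 1) * icc
    return math.ceil(n_raw * deff / (1 - attrition))

# Largest single-outcome requirement in the text is 425 per group;
# with 15% attrition and no clustering this inflates to 500,
# matching the planned sample of 500 participants per group.
print(adjust_n(425))

# Hypothetical clustered scenario (m = 10, ICC = 0.02) for comparison:
print(adjust_n(425, cluster_size=10, icc=0.02))
```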