Fields Changed

Registration

Field Before After
Last Published
Before: November 06, 2015 12:10 AM
After: November 06, 2015 09:25 AM
Randomization Method
Before: The randomization is done in Stata 13. First we create unique strata across suburbs (676 total) and tariff blocks (6 total). This leads to a total of 3,213 distinct strata. We assign a random number to each household and then, within each stratum, rank the random numbers. Then we randomise on two subsets: indigent and non-indigent. By stratum, we allocate the indigent sample equally across all six treatments and control. For the non-indigent, we create groups by stratum and allow unequal allocation across treatments to account for households that will drop if they are in tariff block 1, have a meter reading greater than 35 days, and have an estimated reading for the first month (for social recognition treatments).
After: The randomization is done using Stata v13. First we create strata across suburbs (676 total) and tariff blocks (6 total). This leads to a total of 3,213 distinct strata. We assign a random number to each household and then, within each stratum, rank the random numbers. Then we randomise on two subsets: indigent and non-indigent. By stratum, we allocate the indigent sample equally across all six treatments and control. For the non-indigent, we create groups by stratum and allow unequal allocation across treatments to account for households that will drop if they are in tariff block 1, have a meter reading greater than 35 days, and have an estimated reading for the first month (for social recognition treatments).
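The within-stratum ranking step described above can be sketched in a few lines. The study itself used Stata; this Python version is only an illustration, and the function name `assign_within_strata`, the arm labels, the seed, and the random starting offset per stratum are all assumptions rather than details taken from the registration. It covers only the equal-allocation case (the indigent subset), not the unequal allocation used for non-indigent households.

```python
import random
from collections import defaultdict

# Six treatments plus control, as in the registration; the labels are hypothetical.
ARMS = ["control"] + [f"treatment_{i}" for i in range(1, 7)]

def assign_within_strata(households, seed=0):
    """Assign arms equally within each (suburb, tariff block) stratum.

    households: iterable of (household_id, suburb, tariff_block) tuples.
    Returns a dict mapping household_id to an arm label.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for hh_id, suburb, block in households:
        # Draw one random number per household and file it under its stratum.
        strata[(suburb, block)].append((rng.random(), hh_id))

    assignment = {}
    for key in sorted(strata):
        ranked = sorted(strata[key])  # rank households by their random draw
        # Assumption: start each stratum at a random arm so that leftover
        # households in small strata do not always land in the same arm.
        offset = rng.randrange(len(ARMS))
        for rank, (_, hh_id) in enumerate(ranked):
            assignment[hh_id] = ARMS[(rank + offset) % len(ARMS)]
    return assignment
```

Cycling through the arms in rank order keeps arm sizes within one household of each other inside every stratum, which is one simple way to get the equal allocation the text describes.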
Power calculation: Minimum Detectable Effect Size for Main Outcomes
Before: We used the most recent consumption data from November 2014 to April 2015 from the City of Cape Town’s municipal database to conduct our power calculations. For the power calculations, we chose to use the months in which our study will be conducted in order to allow for seasonality effects, as consumption increases in the summer months. We matched the municipal data with the list of contract accounts we received from the City's printers. We removed those consuming 6 kiloliters/month or below, as well as the 95th percentile to control for outliers due to measurement errors. We then calculated mean consumption over the treatment period last year (December-April). We include two power calculations: one where we look at the mean consumption over the treatment period with an unbalanced panel and one where we use the balanced panel.

I) With our sample size, we are able to detect a 1.5% change in means per treatment. Assuming a standard deviation of 11.03, a mean of 21.46 kiloliters/month, an alpha level of 0.05, and power of 0.8, a 1.5% detectable difference in means would pick up an effect if consumption decreases to 21.14 kiloliters/month (a difference of 0.32 kiloliters/month), with a minimum sample size of 18,452 per arm. We have tried various strategies for the power calculations, yet the strategy is not sensitive to changes in the detectable effect size. To be able to detect a change from 21.46 kiloliters/month to 20.81 kiloliters/month (3%), we would need a sample size of 4,614 per arm. To be able to detect a change from 21.46 to 21.03 kiloliters/month (2%), we would need a sample size of 10,380 per arm. If we wanted to test a 1% change from 21.46 to 21.24 kiloliters/month, we would need a sample size of 41,516 per arm. We assume there will be high variability in the effect size across income groups. We will use property values and suburb as covariates in our regression models to decrease the variance.

II) Our power calculations are robust when using the balanced sample (those whose consumption we observe in each month). With our sample size, we are able to detect a 1.5% change in means per treatment. Assuming a standard deviation of 9.5, a mean of 21.13 kiloliters/month, an alpha level of 0.05, and power of 0.8, a 1.5% detectable difference in means would pick up an effect if consumption decreases to 20.82 kiloliters/month (a difference of 0.31 kiloliters/month), with a sample size of 14,127 households per arm. To be able to detect a change from 21.13 kiloliters/month to 20.5 kiloliters/month (3%), we would need a sample size of 3,533 per arm. To be able to detect a change from 21.13 to 20.7 kiloliters/month (2%), we would need a sample size of 7,947 per arm. If we wanted to test a 1% change from 21.13 to 20.92 kiloliters/month, we would need a sample size of 31,783 per arm.

After: We used the most recent consumption data from November 2014 to April 2015 from the City of Cape Town’s municipal database to conduct our power calculations. For the power calculations, we chose to use the months in which our study will be conducted in order to allow for seasonality effects, as consumption increases in the summer months. We matched the municipal data with the list of contract accounts we received from the City's printers. We removed those consuming 6 kiloliters/month or below, as well as the 95th percentile to control for outliers due to measurement errors. We then calculated mean consumption over the treatment period last year (December-April). We include two power calculations: one where we look at the mean consumption over the treatment period with an unbalanced panel and one where we use the balanced panel.

I) With our sample size, we are able to detect a 1.5% change in means per treatment. Assuming a standard deviation of 11.08, a mean of 21.47 kiloliters/month, an alpha level of 0.05, and power of 0.8, a 1.5% detectable difference in means would pick up an effect if consumption decreases to 21.15 kiloliters/month (a difference of 0.32 kiloliters/month), with a minimum sample size of 18,579 per arm. We have tried various strategies for the power calculations, yet the strategy is not sensitive to changes in the detectable effect size. We assume there will be high variability in the effect size across income groups. We will use property values and suburb as covariates in our regression models to decrease the variance.

II) Our power calculations are robust when using the balanced sample (those whose consumption we observe in each month). With our sample size, we are able to detect a 1.5% change in means per treatment. Assuming a standard deviation of 9.5, a mean of 21.1 kiloliters/month, an alpha level of 0.05, and power of 0.8, a 1.5% detectable difference in means would pick up an effect if consumption decreases to 20.8 kiloliters/month (a difference of 0.31 kiloliters/month), with a sample size of 14,104 households per arm.
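The per-arm sample sizes quoted above can be approximated with the textbook normal-approximation formula for comparing two means, n = 2(z_{1-alpha/2} + z_{power})^2 sd^2 / delta^2. The registration does not say which command or formula produced its figures, so the formula choice and the function name `n_per_arm` are assumptions. The results come out close to the registered numbers but do not match them exactly.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(mean, sd, pct_change, alpha=0.05, power=0.8):
    """Approximate households per arm needed to detect a relative change in mean
    consumption between two equal-sized arms (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = NormalDist().inv_cdf(power)          # quantile matching the target power
    delta = mean * pct_change                      # absolute difference in kiloliters/month
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2)

# Unbalanced-panel inputs quoted in the text: mean 21.47, standard deviation 11.08.
for pct in (0.01, 0.015, 0.02, 0.03):
    print(f"{pct:.1%} change: about {n_per_arm(21.47, 11.08, pct):,} per arm")
```

Because the required sample grows with 1/delta^2, halving the detectable effect roughly quadruples the households needed per arm. This matches the pattern in the registered 1%, 2%, and 3% figures.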