Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
Our calculations of statistical power focus on blood hemoglobin (Hb) concentrations and anemia prevalence as the primary outcomes of interest. We will also study impacts on a number of other biomarkers and health indicators (in particular ferritin levels, a key indicator of iron stores), on socio-economic outcomes, and on cognitive outcomes for children. We base the power calculations on Hb concentrations and anemia prevalence because these outcomes are particularly demanding in terms of statistical power.
Using data from rural areas surveyed in the 2005-06 round of the NFHS, we find that the intra-village correlations for hemoglobin and anemia range from 0.02 for anemia among men to 0.11 for anemia among children (6-59 months old). Data from the 1998-99 round of the NFHS yield similar estimates. The review of randomized evaluations of food fortification with iron reported in Gera et al. (2012) finds an average increase in Hb of 0.42 g/dl (95% CI: 0.28-0.56), and an average reduction in the risk of anemia of 41% (95% CI: 29-52%).
Recent data from the 2012-13 District Level Household Survey (DLHS) show wide variation in the prevalence of anemia across districts in Tamil Nadu. In several districts anemia rates are 50% or higher among women of reproductive age and 60% or higher among children, and we use these figures to approximate anemia prevalence at baseline. We first calculate the effect size (E), defined as the ratio between the minimum difference between groups that we want to be able to detect (in the numerator) and the standard deviation of the outcome at baseline (in the denominator). If we evaluate the power to detect a relatively small 15% relative reduction in anemia, the effect size for anemia among women of reproductive age is 0.15 (= 0.50*0.15/sqrt(0.5*0.5)), while it is 0.18 (= 0.60*0.15/sqrt(0.6*0.4)) for children. We also assume conservatively that the intra-village correlation (R) will be 0.10 and that baseline Hb and anemia will explain only 10% of the variation in the outcomes (R2 = 0.10).
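The effect-size arithmetic above can be reproduced with a short sketch. The function name and argument names are illustrative and not part of the original calculations; the only inputs are the baseline prevalences (50% and 60%) and the 15% relative reduction stated in the text.

```python
from math import sqrt


def binary_effect_size(baseline_prevalence: float, relative_reduction: float) -> float:
    """Standardized effect size E for a binary outcome.

    Numerator: absolute change in prevalence implied by the relative reduction.
    Denominator: baseline standard deviation of a binary outcome, sqrt(p * (1 - p)).
    """
    absolute_change = baseline_prevalence * relative_reduction
    baseline_sd = sqrt(baseline_prevalence * (1.0 - baseline_prevalence))
    return absolute_change / baseline_sd


# Baseline prevalences taken from the DLHS figures quoted in the text.
women = binary_effect_size(0.50, 0.15)
children = binary_effect_size(0.60, 0.15)
print(f"E (women of reproductive age): {women:.2f}")
print(f"E (children): {children:.2f}")
```

Because the power calculations use the smaller of the two values (E = 0.15), they are conservative for the child sample.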
Given these numbers, we propose to study 220 clusters (each cluster corresponds to the catchment area of an FPS), randomly allocated to either T (n=110) or C (n=110). Using the Optimal Design software, we estimate that with R = 0.10 and R2 = 0.10, we will need 40 tests per cluster to have a 91% probability of detecting an effect size E = 0.15 at the 5% significance level (95% confidence level).
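As a rough cross-check on the figure above, the following is a minimal sketch of a power calculation for a balanced two-arm cluster-randomized design. It is not the Optimal Design software itself: it uses a normal approximation, and it assumes that the baseline covariate reduces only the between-cluster variance component by the factor (1 - R2). Both choices, and all function and parameter names, are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def crt_power(effect_size: float, clusters_total: int, n_per_cluster: int,
              icc: float, r2_between: float = 0.0, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a balanced cluster-randomized trial.

    The outcome is standardized (total variance 1), so `effect_size` is E.
    Assumption: the covariate shrinks only the between-cluster variance.
    """
    # Variance of a single cluster mean after covariate adjustment.
    cluster_mean_var = icc * (1.0 - r2_between) + (1.0 - icc) / n_per_cluster
    # Difference of two arm means, each averaging clusters_total / 2 clusters.
    se = sqrt(4.0 * cluster_mean_var / clusters_total)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Normal approximation; the negligible opposite-tail rejection is ignored.
    return NormalDist().cdf(effect_size / se - z_crit)


power = crt_power(effect_size=0.15, clusters_total=220, n_per_cluster=40,
                  icc=0.10, r2_between=0.10)
print(f"Approximate power: {power:.3f}")
```

Under these assumptions the approximation comes out at about 0.91, in line with the figure reported from Optimal Design. A calculation based on the t distribution would give a marginally lower value.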
We also ran power calculations for alternative designs, including 160 control clusters and 60 treatment clusters (with 40 tests per cluster); 150 control clusters and 50 treatment clusters (with 20 tests per cluster); and 120 control clusters and 60 treatment clusters (with 40 tests per cluster). Statistical power remains in the range of 70-90% across these alternative designs. Final decisions will be made keeping in mind logistical and budgetary constraints.
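The alternative designs listed above have unequal numbers of treatment and control clusters, so a sketch that allows unequal allocation may be useful for comparing them. As before, this is a normal-approximation illustration and not the Optimal Design computation; the assumption that R2 applies to the between-cluster variance, and all names in the code, are hypothetical. The sketch also reports the minimum detectable effect size at 80% power, a conventional target that is not stated in the text.

```python
from math import sqrt
from statistics import NormalDist

_NORMAL = NormalDist()


def standard_error(clusters_treat: int, clusters_control: int, n_per_cluster: int,
                   icc: float = 0.10, r2_between: float = 0.10) -> float:
    """Standard error of the standardized treatment effect with unequal arms."""
    # Assumption: the covariate shrinks only the between-cluster variance.
    cluster_mean_var = icc * (1.0 - r2_between) + (1.0 - icc) / n_per_cluster
    return sqrt(cluster_mean_var * (1.0 / clusters_treat + 1.0 / clusters_control))


def power(effect_size: float, se: float, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a given effect size and standard error."""
    return _NORMAL.cdf(effect_size / se - _NORMAL.inv_cdf(1.0 - alpha / 2.0))


def mde(se: float, target_power: float = 0.80, alpha: float = 0.05) -> float:
    """Minimum detectable effect size at the target power (80% is an assumption)."""
    return (_NORMAL.inv_cdf(1.0 - alpha / 2.0) + _NORMAL.inv_cdf(target_power)) * se


# (treatment clusters, control clusters, tests per cluster), as listed in the text.
designs = {
    "110 T / 110 C, 40 tests": (110, 110, 40),
    "60 T / 160 C, 40 tests": (60, 160, 40),
    "50 T / 150 C, 20 tests": (50, 150, 20),
    "60 T / 120 C, 40 tests": (60, 120, 40),
}

results = {}
for label, (j_t, j_c, n) in designs.items():
    se = standard_error(j_t, j_c, n)
    results[label] = (power(0.15, se), mde(se))
    print(f"{label}: power = {results[label][0]:.2f}, MDE = {results[label][1]:.3f}")
```

Under these assumptions the three alternative designs come out at roughly 0.84, 0.71 and 0.81 for E = 0.15, consistent with the 70-90% range quoted above. The comparison also shows that moving clusters from treatment to control raises the standard error for a fixed total number of clusters.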