Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
The statistical power of our tests is determined by the number of employees who visit the portal, their baseline rate of enrollment, and the number of plans represented (in estimating treatment effects, we intend to cluster standard errors at the plan level, and because we focus on small-to-mid market plans, these clusters can be small). Our most informed estimates of statistical power therefore come from prior research administered to the same Voya market segment in roughly the same decision context ("Picking up the Pace", Mason 2019, unpublished).
This prior research involved a field test of three treatment variations on employees' escalation enrollment decisions. In its baseline specification, the paper estimates treatment effects on the binary enrollment decision with a standard error of 0.01 from a sample of approximately 8,700 employees across 2,400 plans (Table 2). These estimates suggest that identifying changes in automatic escalation enrollment of approximately 0.02, using pairwise comparisons of interventions, would require approximately 2,900 employees per condition (assuming the same distribution of plan sizes).
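The sample-size logic above can be sketched with standard power arithmetic. This is a simplified illustration, assuming a two-sided test at alpha = 0.05 with 80% power (z-multiplier ≈ 2.8) and ignoring plan-level clustering and the distribution of plan sizes, which the 2,900-per-condition figure additionally accounts for; the only input taken from the prior study is the 0.01 standard error.

```python
# Back-of-envelope MDE and sample-size scaling. All inputs are
# illustrative assumptions except the prior study's SE of 0.01
# on the treatment effect (Mason 2019, Table 2).

Z_MULT = 2.8          # z_{0.975} + z_{0.80} ≈ 1.96 + 0.84
se_prior = 0.01       # SE of the treatment effect in the prior study
target_mde = 0.02     # effect size we want to be able to detect

# Minimum detectable effect implied by the prior study's precision
mde_prior = Z_MULT * se_prior          # 0.028

# SE needed to detect the target effect at the same alpha/power
se_needed = target_mde / Z_MULT        # ≈ 0.0071

# SE shrinks with sqrt(n), so sample size must scale by the
# squared ratio of current to required SE:
scale = (se_prior / se_needed) ** 2    # ≈ 1.96, i.e. roughly double

print(f"MDE at prior precision: {mde_prior:.3f}")
print(f"Required SE: {se_needed:.4f}")
print(f"Sample-size scaling factor: {scale:.2f}")
```

Because clustering at the plan level inflates standard errors relative to this independent-observations sketch, the study's per-condition targets are best read as empirically calibrated from the prior field test rather than from a formula like this one.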
There are three reasons to expect slightly higher statistical power in our current test. First, we introduce sample restrictions to focus on larger plans, which increases our statistical power for a given overall sample. Second, we intend to oversample the baseline condition both within the study (doubling it) and through the use of a pre-period control (from Oct 2018). Finally, while we are interested in pairwise comparisons, our central tests of the three mechanisms involve tests across groups of interventions. We therefore roughly assume a need for 2,300 employees per condition to identify a treatment effect on escalation enrollment of approximately 0.02. Across the 10 interventions and 11 treatment assignments (including the double-sampled baseline), this implies a non-excluded sample of approximately 25,000 employees. Allowing for a small fraction of exclusions due to missing data or commercial restrictions, and recognizing the additional power provided by the pre-trial sample period, we aspire to approximately 27,000 active online enrollments.
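The arithmetic connecting the per-condition target to the overall enrollment goal can be laid out explicitly. The 7% exclusion buffer below is an illustrative assumption chosen only to show how the 25,000 non-excluded target relates to the ~27,000 enrollment goal; the source does not state the expected exclusion rate.

```python
# Total sample arithmetic from the per-condition target.
per_condition = 2_300
assignments = 11                   # 10 interventions + double-sampled baseline
non_excluded = per_condition * assignments   # 25,300, i.e. ~25,000

exclusion_buffer = 0.07            # assumed loss to missing data / exclusions
enrollments_needed = non_excluded / (1 - exclusion_buffer)

print(non_excluded)                # 25300
print(round(enrollments_needed))   # 27204, i.e. ~27,000
```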