|
Field
Sample size (or number of clusters) by treatment arms
|
Before
Randomization A: 4,000 invites to the “(Natural) Field Experiment” arm; our goal is to have a total of around 360 acceptances. Similar studies have had acceptance rates between 12% and 20%. If the acceptance rate is 12%, we expect that approximately 480 will accept; if the acceptance rate is 20%, we expect that around 800 will accept. We have set a goal of around 360 acceptances and will not hire workers after we hit that goal due to budgetary constraints.
Randomization B: 4,000 invites to the “Framed Field Experiment” arm; our goal is to have a total of around 360 acceptances. We will follow a procedure similar to that for Randomization A to do our best to reach this goal.
Randomizations C, D, E and F: 2,000 invites each; our goal is to have a total of around 180 acceptances in each of these four arms. Similar studies have had acceptance rates between 12% and 20%. If the acceptance rate is 12%, we expect that approximately 240 will accept; if instead the acceptance rate is 20%, we expect that around 400 will accept. We have set a goal of around 180 acceptances per arm and will not hire workers after we hit that goal due to budgetary constraints.
|
After
Randomization A: 4,000 invites to arm A; our goal is to have a total of around 360 acceptances. Similar studies have had acceptance rates between 12% and 20%. If the acceptance rate is 12%, we expect that approximately 480 will accept; if the acceptance rate is 20%, we expect that around 800 will accept. We have set a goal of around 360 acceptances and will not hire workers after we hit that goal due to budgetary constraints.
Randomization B: 4,000 invites to the “Framed Field Experiment” arm; our goal is to have a total of around 360 acceptances. We will follow a procedure similar to that for Randomization A to do our best to reach this goal.
Randomizations C, D, E and F: 2,000 invites each; our goal is to have a total of around 180 acceptances in each of these four arms. Similar studies have had acceptance rates between 12% and 20%. If the acceptance rate is 12%, we expect that approximately 240 will accept; if instead the acceptance rate is 20%, we expect that around 400 will accept. We have set a goal of around 180 acceptances per arm and will not hire workers after we hit that goal due to budgetary constraints.
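As a quick consistency check on the figures above, a minimal sketch of the expected-acceptance arithmetic (plain Python; the invitation counts, the 12% and 20% rate bounds, and the acceptance goals are simply the numbers stated in this field):

```python
# Minimal sketch: expected acceptances under the bracketing acceptance rates.
# Invitation counts and goals are taken from the arm descriptions above.
invites = {"A": 4000, "B": 4000, "C": 2000, "D": 2000, "E": 2000, "F": 2000}
goals = {"A": 360, "B": 360, "C": 180, "D": 180, "E": 180, "F": 180}

for arm, n in invites.items():
    low, high = 0.12 * n, 0.20 * n  # pilot-based bounds on the acceptance rate
    print(f"Arm {arm}: expect {low:.0f}-{high:.0f} acceptances; "
          f"hiring stops at the goal of {goals[arm]}")
```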
|
|
Field
Power calculation: Minimum Detectable Effect Size for Main Outcomes
|
Before
Power for selection into the experiment (all treatment information is included in the invitation letters):
Based on pilot data, we expect that acceptance and participation rates in group A will be about 12%-20%. With a sample size of 4,000 invitations in group (A), the natural field experiment, and group (B), the framed field experiment, and with alpha=0.05 and power=0.80, we will be powered to detect a 2.5 percentage point difference in the acceptance rate between these two groups if the acceptance rate is 20%, or a 2.1 percentage point difference if the acceptance rate is 12%.
For the groups which test mechanisms, (C), (D), (E) and (F), with 2,000 invitations in each of these groups, for the comparisons A vs. E, A vs. F, B vs. C, and B vs. D we will be powered (alpha=0.05, power=0.80) to detect a 3.1 percentage point difference in acceptance rates if the true acceptance rate is 20%, and a 2.6 percentage point difference if the true acceptance rate is 12%.
Power for applicant-level outcomes:
We base our power calculations on the binary callback indicator (0/1) for a candidate. From pilot data, we expect the callback rate to differ by about 10 percentage points between our advantaged and disadvantaged candidates when the contrast is female vs. male (the difference will likely be larger for Black vs. White and Crime vs. NoCrime, given previous audit meta-studies). Our pilot data also suggest an intra-recruiter correlation coefficient of 0.05. We simulated data for our regression of interest with stratified randomization, callback as the dependent variable, treatment and treatment x advantaged-candidate variables, standard errors clustered at the strata level, and recruiter fixed effects. Based on 1,000 simulations with 360 recruiters in each of treatment arms (A) and (B), we have 98.6% power to detect the 10pp change for the main effect. With 180 recruiters in each of groups (C)-(F), we have 95% power to detect the 10pp change in callback rates for the mechanism comparisons to (A) or (B).
|
After
Power for selection into the experiment (all treatment information is included in the invitation letters):
Based on pilot data, we expect that acceptance and participation rates in group A will be about 12%-20%. With a sample size of 4,000 invitations in group (A) versus group (B), and with alpha=0.05 and power=0.80, we will be powered to detect a 2.5 percentage point difference in the acceptance rate between these two groups if the acceptance rate is 20%, or a 2.1 percentage point difference if the acceptance rate is 12%.
For the groups which test mechanisms, (C), (D), (E) and (F), with 2,000 invitations in each of these groups, for the comparisons A vs. E, A vs. F, B vs. C, and B vs. D we will be powered (alpha=0.05, power=0.80) to detect a 3.1 percentage point difference in acceptance rates if the true acceptance rate is 20%, and a 2.6 percentage point difference if the true acceptance rate is 12%.
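These minimum detectable differences can be approximated with a standard two-sample test of proportions. The sketch below is an illustration under the usual normal-approximation formula with both arms at the baseline rate, not necessarily the exact calculation used for the registration, and `mde_two_proportions` is a hypothetical helper name:

```python
from scipy.stats import norm

def mde_two_proportions(p, n1, n2, alpha=0.05, power=0.80):
    """Minimum detectable difference in proportions for a two-sided test,
    using the normal approximation with both arms at baseline rate p."""
    z_alpha = norm.ppf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = norm.ppf(power)            # ~0.84 for power = 0.80
    se = (p * (1 - p) * (1 / n1 + 1 / n2)) ** 0.5
    return (z_alpha + z_beta) * se

# A vs. B: 4,000 invitations in each arm
for p in (0.20, 0.12):
    print(f"A vs. B at base rate {p:.0%}: MDE ~ {mde_two_proportions(p, 4000, 4000):.3f}")

# Mechanism comparisons (e.g., A vs. E): 4,000 vs. 2,000 invitations
for p in (0.20, 0.12):
    print(f"A vs. E at base rate {p:.0%}: MDE ~ {mde_two_proportions(p, 4000, 2000):.3f}")
```

This simplified formula gives roughly 2.5pp and 2.0pp for the A vs. B comparison and roughly 3.1pp and 2.5pp for the 4,000 vs. 2,000 comparisons, in the same ballpark as the registered figures; the small gaps at the 12% rate reflect the simplified variance assumption.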
Power for applicant-level outcomes:
We base our power calculations on the binary callback indicator (0/1) for a candidate. From pilot data, we expect the callback rate to differ by about 10 percentage points between our advantaged and disadvantaged candidates when the contrast is female vs. male (the difference will likely be larger for Black vs. White and Crime vs. NoCrime, given previous audit meta-studies). Our pilot data also suggest an intra-recruiter correlation coefficient of 0.05. We simulated data for our regression of interest with stratified randomization, callback as the dependent variable, treatment and treatment x advantaged-candidate variables, standard errors clustered at the strata level, and recruiter fixed effects. Based on 1,000 simulations with 360 recruiters in each of treatment arms (A) and (B), we have 98.6% power to detect the 10pp change for the main effect. With 180 recruiters in each of groups (C)-(F), we have 95% power to detect the 10pp change in callback rates for the mechanism comparisons to (A) or (B).
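A minimal sketch of how such a simulation-based power check can be structured is below. It is simplified relative to the registered specification: a single advantaged-candidate contrast rather than the full treatment-by-advantaged design, clustering on recruiter rather than strata, an assumed 10 applicants per recruiter, an assumed 30% baseline callback rate, and a rough calibration of the recruiter effect to the 0.05 intra-recruiter correlation; all names and parameter values other than those stated above are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

def one_replication(n_recruiters=360, apps_per_recruiter=10,
                    base_rate=0.30, gap=0.10, icc=0.05):
    """Simulate one dataset and return whether the 10pp advantaged-candidate
    callback gap is detected at the 5% level (linear probability model,
    standard errors clustered on recruiter).  Values other than the 10pp gap
    and the 0.05 ICC are illustrative placeholders."""
    # Recruiter random effect sized so that its share of the outcome variance
    # is roughly the target ICC (a rough calibration, not exact).
    total_var = base_rate * (1 - base_rate)
    recruiter_effect = rng.normal(0, np.sqrt(icc * total_var), n_recruiters)

    rows = []
    for r in range(n_recruiters):
        for _ in range(apps_per_recruiter):
            advantaged = rng.integers(0, 2)
            p = np.clip(base_rate + gap * advantaged + recruiter_effect[r], 0, 1)
            rows.append({"recruiter": r,
                         "advantaged": advantaged,
                         "callback": rng.binomial(1, p)})
    df = pd.DataFrame(rows)

    fit = smf.ols("callback ~ advantaged", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["recruiter"]})
    return fit.pvalues["advantaged"] < 0.05

n_sims = 200  # the registration reports 1,000 simulations; fewer here for speed
power = np.mean([one_replication() for _ in range(n_sims)])
print(f"Simulated power to detect a 10pp callback gap: {power:.2f}")
```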
|