Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Accounting for the study design (between-subject randomization, no cluster-level assignment), I calculate minimum detectable effect sizes (MDEs) at 80% power and a 5% significance level for each main outcome:
- Perception Study (Likert outcomes, scale 1–10): Assuming a standard deviation of 2 points (based on pilot data) and a sample of 400–500 respondents each rating six headshots (about 2,400–3,000 ratings in total), the MDE is 0.25–0.30 points on the 10-point scale, or 0.125–0.15 standard deviations. This would capture modest shifts in perceived traits such as trustworthiness or employability due to oral health appearance.
- Correspondence Study (binary outcome – callback): Assuming a baseline callback rate of 10% and 5,000 applications split equally between control and treatment arms (2,500 each), a two-sided comparison of proportions gives an MDE of about 2.4 percentage points. This translates to detecting a reduction in callbacks from 10% to roughly 7.6% or lower for resumes featuring poor oral health appearance.
- Employer Resume Rating Experiment (binary outcome – hiring decision): Assuming a baseline “yes” rate of 50% per resume and 300–400 HR respondents each rating about 12 resumes (3,600–4,800 observations in total), the MDE is approximately 5.5–6 percentage points. This corresponds to detecting a drop in hiring likelihood from 50% to 44%–44.5% for resumes with visible oral health issues.
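
The calculations above can be sketched with a standard normal-approximation formula, assuming two equal arms and a two-sided test. This is an illustration, not the exact procedure used for the figures reported here: the function names are hypothetical, the binary case uses the baseline variance in both arms as a simplification, and the intraclass correlation (ICC) values for repeated ratings by the same respondent are placeholders rather than pilot estimates.

```python
from math import sqrt
from statistics import NormalDist


def z_multiplier(alpha=0.05, power=0.80):
    """Sum of the critical values for a two-sided test at `alpha` and the target power."""
    norm = NormalDist()
    return norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)


def design_effect(ratings_per_respondent, icc):
    """Variance inflation when one respondent contributes several correlated ratings."""
    return 1 + (ratings_per_respondent - 1) * icc


def mde_continuous(sd, n_total, deff=1.0, alpha=0.05, power=0.80):
    """MDE for a difference in means with n_total observations split into two equal arms."""
    n_per_arm = n_total / 2
    return z_multiplier(alpha, power) * sd * sqrt(deff * 2 / n_per_arm)


def mde_binary(baseline, n_total, deff=1.0, alpha=0.05, power=0.80):
    """MDE for a difference in proportions, using the baseline variance in both arms."""
    sd = sqrt(baseline * (1 - baseline))
    return mde_continuous(sd, n_total, deff, alpha, power)


# ICC values are hypothetical placeholders (assumptions), not pilot estimates.
perception = mde_continuous(sd=2.0, n_total=2400, deff=design_effect(6, icc=0.10))
correspondence = mde_binary(baseline=0.10, n_total=5000)
employer = mde_binary(baseline=0.50, n_total=3600, deff=design_effect(12, icc=0.05))

print(f"Perception study MDE: {perception:.2f} points")
print(f"Correspondence study MDE: {100 * correspondence:.1f} percentage points")
print(f"Employer rating MDE: {100 * employer:.1f} percentage points")
```

Setting `icc=0` reduces the design effect to 1 and treats every rating as independent; larger ICC values raise the MDE, so the results for the two rating studies are sensitive to that assumption.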