Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
Sample size was determined from the least-powered primary outcome in the pilot study. In the pilot, the full-sample difference in this outcome was approximately 0.046 on a 0-1 normalized outcome scale, with a pooled standard deviation of approximately 0.195, which corresponds to a standardized effect of roughly 0.24 standard deviations. Because the theoretical predictions are directional, the main preregistered tests for the primary outcomes will use one-sided comparisons in the predicted direction; for transparency, we will also report two-sided p-values and confidence intervals. With a one-sided 5% significance level and 80% power, detecting the pilot effect for the least-powered primary outcome requires approximately 225 participants per treatment arm. We therefore plan to recruit 500 participants in total, corresponding to 250 participants per arm. At this sample size, the minimum detectable effect is approximately 0.043 units, or 4.3 percentage points, on the normalized outcome scale. Participants who drop out before completion, or whose data cannot be used because of technical problems affecting the primary outcomes, will be replaced.
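The figures above can be approximately reproduced with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in the text: a two-arm comparison of means with equal allocation and equal variances, a normal approximation to the test statistic, and no adjustment for clustering or covariates. The function names (`z_sum`, `n_per_arm`, `mde`) are hypothetical helpers introduced for this example, not part of any analysis script.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_sum(alpha: float, power: float) -> float:
    """Sum of the one-sided critical value and the power quantile."""
    nd = NormalDist()  # standard normal
    return nd.inv_cdf(1 - alpha) + nd.inv_cdf(power)


def n_per_arm(effect: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants per arm needed to detect `effect` (normal approximation)."""
    return ceil(2 * (z_sum(alpha, power) * sd / effect) ** 2)


def mde(n: int, sd: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Minimum detectable effect with `n` participants per arm."""
    return z_sum(alpha, power) * sd * sqrt(2 / n)


# Pilot values taken from the text: difference 0.046, pooled SD 0.195.
print(n_per_arm(effect=0.046, sd=0.195))
# Planned sample: 250 participants per arm.
print(round(mde(n=250, sd=0.195), 3))
```

Under these assumptions the required sample comes out slightly below the approximately 225 per arm stated above, and the minimum detectable effect at 250 per arm rounds to 0.043. Small differences from the stated figures are expected if the original calculation used a t-distribution or rounded upward.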