Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
The statistical analysis consists of pairwise mean comparisons between treatment conditions within a repeated-measures framework. Following Cohen’s effect-size approach, the target detectable difference is expressed as Cohen’s d, the standardized mean difference $d = (\mu_1 - \mu_2)/\sigma$, which is unit-free and comparable across outcomes (Cohen, 1988). We set a two-sided significance level of α = 0.05 and target power of 1 − β = 0.80. Because each participant completes repeated auction rounds within each block, the required per-treatment sample size is adjusted for M repeated
observations per subject and within-person correlation ρ (design-effect factor). The per-group sample size for comparing two means is:
$n = \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}}{d^{2}} \cdot \frac{1 + (M - 1)\rho}{M}$, as in standard repeated-measures power calculations (Diggle et al., 2002; Kupper and Hafner,
1989; Lui and Wu, 2005). Intuitively, for fixed M, a higher ρ reduces the incremental information contributed by each additional repeated measure and therefore increases the required number of participants, whereas a larger d reduces the required sample size. We compute sensitivity for M = 4 (homegrown rounds) and ρ ∈ {0.25, 0.50, 0.75} under Cohen’s conventional benchmarks d ∈ {0.20, 0.50, 0.80}. Based on conservative assumptions (the formula gives n ≈ 245 at d = 0.20, ρ = 0.50, and M = 4), we target 245 participants per treatment arm and oversample by 10% to account for attrition and unusable responses, yielding a recruitment target of 270 participants per treatment arm (total N = 1,080). Primary analyses will use inference procedures that account for within-participant dependence arising from repeated rounds, with standard errors clustered at the individual level; the sample size calculation above addresses the same repeated-measures structure at the participant level.
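
The sensitivity calculation described above can be reproduced with a short script. The following is a minimal sketch, not the authors' code: it assumes the design-effect factor is (1 + (M − 1)ρ)/M, and the function names `n_per_arm` and `sensitivity_table` are hypothetical. The rule used for the recruitment figure (round the per-arm n, then inflate by 10% and round up) is also an assumption.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(d, rho, m, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided comparison of two means
    with m repeated measures per subject and within-person correlation rho."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # z_{1 - alpha/2}
    z_beta = std_normal.inv_cdf(power)           # z_{1 - beta}
    base = 2 * (z_alpha + z_beta) ** 2 / d ** 2  # single-measurement sample size
    design_factor = (1 + (m - 1) * rho) / m      # assumed repeated-measures adjustment
    return base * design_factor


def sensitivity_table(m=4, oversample=0.10):
    """Return (d, rho, n, recruitment target) over the benchmark grid."""
    rows = []
    for d in (0.20, 0.50, 0.80):
        for rho in (0.25, 0.50, 0.75):
            n = n_per_arm(d, rho, m)
            # Assumed rule: round n, inflate for attrition, round up.
            recruit = ceil(round(n) * (1 + oversample))
            rows.append((d, rho, n, recruit))
    return rows


for d, rho, n, recruit in sensitivity_table():
    print(f"d={d:.2f}  rho={rho:.2f}  n per arm={n:6.1f}  recruit per arm={recruit}")
```

Under these assumptions, the row for d = 0.20 and ρ = 0.50 gives roughly 245 participants per arm and a recruitment figure of 270, which is consistent with the targets stated in the text.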