Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
At the planned N (100 per arm), Monte Carlo TOST power on the OR primary is 80.3% (B = 20,000 simulation replications; 95% Monte Carlo CI [79.7%, 80.9%]) at alpha = 0.05 per tail with an equivalence margin of plus-or-minus 10 pp, anchored at p_neutral = 0.97 and p_flag = 0.95. The anchor reflects an expectation of high recall under the stimulus design: the cards are clear single-state illustrations shown for 10 seconds, intentionally well above the threshold at which differential reading speed would be the bottleneck. A small calibration pilot (N approximately 30, May 2026, CIDE undergraduate convenience sample) validated this anchor empirically and confirmed that the instrument functions as intended (exposure time tight at 10.15-10.43 s; zero bonus-computation mismatches; clean arm x state and arm x bonus-question balance). Pilot data are not included in the registered analysis, and the pilot was conducted on a separate population from the main wave.
Power was computed by direct Monte Carlo simulation of the TOST procedure on the OR primary, not by closed-form normal approximations. At the empirical-ceiling anchor (0.97 / 0.95), the power function moves in discrete steps in the per-arm success counts (k_flag, k_neutral): the TOST decision flips only when a count crosses one of the integer thresholds at which a Newcombe-CI bound crosses the equivalence margin. This makes asymptotic normal approximations unreliable. The exact simulation code, anchor sweeps, and per-N power values are in `analysis/power_sweep.R` and `analysis/power_fine.R`.
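The registered simulation is the R code in the scripts named above. The following is only a minimal Python sketch of the same kind of procedure, included to make the logic concrete. It assumes that the TOST decision is taken by checking whether the two-sided 90% Newcombe hybrid-score interval for p_flag - p_neutral lies strictly inside the margin, and that arms are independent binomials of equal size. All function names, the seed, and the default arguments are hypothetical and do not come from the registered scripts.

```python
import math
import random
from statistics import NormalDist


def wilson_interval(k, n, z):
    """Wilson score interval for a binomial proportion k/n."""
    p_hat = k / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def newcombe_interval(k1, n1, k2, n2, z):
    """Newcombe hybrid-score interval for the difference p1 - p2."""
    p1, p2 = k1 / n1, k2 / n2
    l1, u1 = wilson_interval(k1, n1, z)
    l2, u2 = wilson_interval(k2, n2, z)
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


def is_equivalent(k_flag, k_neutral, n, margin, z):
    """Assumed TOST rule: the whole interval must sit inside (-margin, +margin)."""
    lower, upper = newcombe_interval(k_flag, n, k_neutral, n, z)
    return -margin < lower and upper < margin


def tost_power(p_flag, p_neutral, n=100, margin=0.10, alpha=0.05,
               reps=20_000, seed=1):
    """Monte Carlo estimate of TOST power, with its Monte Carlo standard error."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha)  # alpha per tail, i.e. a 1 - 2*alpha interval
    # The decision depends only on the integer counts, so tabulate it once.
    # This table is the "discrete-stepped" structure described in the text.
    table = [[is_equivalent(kf, kn, n, margin, z) for kn in range(n + 1)]
             for kf in range(n + 1)]
    hits = 0
    for _ in range(reps):
        k_flag = sum(rng.random() < p_flag for _ in range(n))
        k_neutral = sum(rng.random() < p_neutral for _ in range(n))
        hits += table[k_flag][k_neutral]
    power = hits / reps
    mc_se = math.sqrt(power * (1 - power) / reps)
    return power, mc_se
```

Calling `tost_power(0.95, 0.97)` returns the estimated power at the anchor together with its Monte Carlo standard error; a normal-approximation 95% Monte Carlo interval is the estimate plus or minus 1.96 standard errors. Whether this sketch reproduces the registered figure depends on how closely its assumptions match the R implementation.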
MDES (minimum detectable effect size) is reported under two interpretations, since the TOST design admits both:
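1. As equivalence margin. Delta = 10 pp. This is the smallest true |p_flag - p_neutral| that the test is set up to exclude. Equivalence is declared when the two-sided 90% (1 - 2 alpha) Newcombe CI for p_flag - p_neutral lies entirely inside plus-or-minus 10 pp; a point estimate inside the margin is not sufficient on its own.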
2. As conventional power-bounded MDES on the companion NHST. Under a standard two-sided two-proportion test at alpha = 0.05 with 80% power at p_neutral = 0.97, the smallest detectable absolute difference at N = 100/arm is approximately 13 pp (in practice a decrease, since p_neutral = 0.97 leaves only 3 pp of headroom above it). This is the MDES for the companion NHST, not for the primary inference.
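For the second interpretation, a closed-form normal approximation offers a rough cross-check. The sketch below is illustrative only and is not the calculation behind the figure quoted above. It assumes a pooled-variance two-sided z-test with equal arms, ignores the negligible opposite tail, searches only for a decrease from the reference proportion, and optionally applies a Yates-type continuity correction of 1/n. The function names are hypothetical. The result is sensitive to these conventions (pooled versus unpooled variance, continuity correction, exact versus asymptotic test), so it should not be expected to match the quoted value exactly.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def z_test_power(p1, p2, n, alpha=0.05, continuity=False):
    """Approximate power of a two-sided two-proportion z-test with n per arm."""
    z_a = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n)          # pooled, under H0
    se_alt = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)   # unpooled, under H1
    # Assumed continuity correction for equal arms: (1/2)(1/n + 1/n) = 1/n.
    shift = abs(p1 - p2) - (1 / n if continuity else 0.0)
    return _STD_NORMAL.cdf((shift - z_a * se_null) / se_alt)


def mdes_decrease(p_ref, n, power=0.80, alpha=0.05, continuity=False):
    """Smallest decrease from p_ref detected with the target power (bisection)."""
    lo, hi = 1e-4, p_ref - 1e-4
    for _ in range(60):
        mid = (lo + hi) / 2
        if z_test_power(p_ref, p_ref - mid, n, alpha, continuity) < power:
            lo = mid
        else:
            hi = mid
    return hi
```

`mdes_decrease(0.97, 100)` and `mdes_decrease(0.97, 100, continuity=True)` give the uncorrected and continuity-corrected approximations respectively; the corrected value is the larger of the two.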