Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
The main hypothesis we plan to test is that the availability of the message option crowds out donations. We will test this primary hypothesis with non-parametric and parametric tests comparing the action-space arms (either within a single observability cell or aggregating across cells). Secondary hypotheses concern (i) the moderating effect of observability and (ii) the direct effect of observed actions on later generations.
Given our budget, the target sample is approximately 8,000 participants, i.e. 2,000 per treatment arm. Power calculations focus on pairwise “corner” comparisons between two Gen 0 treatment cells (500 vs 500) using two-sided tests at α=0.05. Based on expected behaviour in our setting (from past literature), we assume an unconditional mean donation of £3.50, a donation rate of 70%, and a standard deviation of donations not exceeding £2.00 on the discrete £0–£10 donation grid.
Under these assumptions, we have approximately 80% power to detect a 10% change in mean donations (a drop from £3.50 to £3.15) in any single cell-to-cell comparison. For the extensive margin, with a baseline donation rate of 70%, we have approximately 80% power to detect a change of 7.8 percentage points in donation participation (0.70 vs approximately 0.62, i.e. about 11% in relative terms). Pooled main-effect comparisons (e.g., Message vs No-message across observability) have substantially higher power for the same effect sizes.
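As a rough cross-check of the cell-to-cell figures above, the following is a minimal sketch of a two-sided power calculation under a normal approximation. It assumes independent observations, equal cell sizes of 500, a common standard deviation of £2.00, and an unpooled variance for the difference in proportions. The function names are illustrative and not part of the registered analysis; clustering and the discreteness of the donation grid are not modelled.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_two_sided(effect, se, alpha=0.05):
    """Power of a two-sided z-test for a true effect with standard error se."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z = abs(effect) / se
    # Probability of rejecting in either tail under the alternative.
    return STD_NORMAL.cdf(z - z_crit) + STD_NORMAL.cdf(-z - z_crit)


def power_means(mean_a, mean_b, sd, n_per_cell, alpha=0.05):
    """Difference in means, assuming a common sd and equal cell sizes."""
    se = sd * sqrt(2 / n_per_cell)
    return power_two_sided(mean_a - mean_b, se, alpha)


def power_proportions(p_a, p_b, n_per_cell, alpha=0.05):
    """Difference in proportions, assuming an unpooled variance estimate."""
    se = sqrt(p_a * (1 - p_a) / n_per_cell + p_b * (1 - p_b) / n_per_cell)
    return power_two_sided(p_a - p_b, se, alpha)


# Inputs taken from the assumptions stated in this section.
mean_power = power_means(3.50, 3.15, sd=2.00, n_per_cell=500)
rate_power = power_proportions(0.70, 0.62, n_per_cell=500)

print(f"Power, mean donation (3.50 vs 3.15): {mean_power:.3f}")
print(f"Power, donation rate (0.70 vs 0.62): {rate_power:.3f}")
```

Exact figures depend on the test and variance formula used, so the output is an approximate check rather than a replacement for the calculation reported above. Passing a larger `n_per_cell` shows how power rises for pooled comparisons at the same effect size.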