Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)

U.S.
For the main outcome, test score (out of 25), the control mean is 10.22 with a standard deviation of 5.63.
With power of 0.8 and a significance level of 0.05, the minimum detectable effect size is 1.13 with 131 clusters of size 3 and 0.98 with 131 clusters of size 4. These two cluster sizes bracket the actual average of 3.41; clusters vary widely in size.
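As a rough illustration of how a minimum detectable effect responds to cluster size, the sketch below applies the standard normal-approximation formula for a two-arm cluster-randomized comparison of means. The intracluster correlation (`icc`), the equal split between arms (`treated_share`), and the function name `cluster_mde` are assumptions made for illustration. The text does not report them, so the output should not be expected to reproduce the figures above.

```python
from math import sqrt
from statistics import NormalDist


def cluster_mde(sd, n_clusters, cluster_size, icc,
                alpha=0.05, power=0.80, treated_share=0.5):
    """Approximate MDE for a two-arm cluster-randomized comparison of means.

    Assumes equal-sized clusters, a two-sided test, and a normal approximation.
    """
    z = NormalDist()
    # Sum of the critical value and the power quantile (about 2.80 here).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Variance inflation from clustering with equal cluster sizes.
    design_effect = 1 + (cluster_size - 1) * icc
    n_total = n_clusters * cluster_size
    variance = sd ** 2 * design_effect / (
        treated_share * (1 - treated_share) * n_total
    )
    return multiplier * sqrt(variance)


# SD and cluster count are taken from the text; the ICC values are
# placeholders, not estimates from the study.
for icc in (0.0, 0.1, 0.3):
    for size in (3, 4):
        mde = cluster_mde(sd=5.63, n_clusters=131, cluster_size=size, icc=icc)
        print(f"icc={icc:.1f} size={size}: MDE = {mde:.2f}")
```

Under these assumptions, the MDE shrinks as cluster size grows, and the gain from larger clusters fades as the intracluster correlation rises.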
Shanghai
For the main outcome, test score (out of 25), the control mean is 20.50 with a standard deviation of 2.95.
With power of 0.8 and a significance level of 0.05, the minimum detectable effect size is 1.05 with 384 clusters of size 2 and 1.00 with 131 clusters of size 1. The actual average cluster size is 1.71; most clusters are of size 1, and four clusters are much larger.
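Because a few very large clusters can matter more than the average size suggests, the sketch below compares the equal-size design effect with a commonly used approximation for unequal cluster sizes, 1 + ((cv^2 + 1) * mean_size - 1) * icc, where cv is the coefficient of variation of cluster size. The size distribution (380 clusters of size 1 and four of size 69, which gives an average near 1.71), the ICC values, and the function name `design_effect_unequal` are hypothetical. The text reports neither the individual cluster sizes nor an ICC.

```python
from statistics import mean, pstdev


def design_effect_unequal(sizes, icc):
    """Approximate design effect when cluster sizes vary.

    Uses 1 + ((cv^2 + 1) * mean_size - 1) * icc, which reduces to the
    familiar 1 + (m - 1) * icc when every cluster has the same size m.
    """
    m_bar = mean(sizes)
    cv = pstdev(sizes) / m_bar  # coefficient of variation of cluster size
    return 1 + ((cv ** 2 + 1) * m_bar - 1) * icc


# Hypothetical sizes chosen only to match "384 clusters, average about 1.71,
# mostly size 1, four much larger"; these are not the study's data.
sizes = [1] * 380 + [69] * 4
m_bar = mean(sizes)

for icc in (0.05, 0.1, 0.3):  # placeholder ICC values
    equal = 1 + (m_bar - 1) * icc
    unequal = design_effect_unequal(sizes, icc)
    print(f"icc={icc:.2f}: equal-size DEFF = {equal:.2f}, "
          f"unequal-size DEFF = {unequal:.2f}")
```

If the real distribution resembles this one, a calculation based on a single rounded cluster size could understate the variance inflation, so the bracketing values are best read as approximate.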