Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
We use data from similarly parameterized treatments in Drichoutis et al. (2025) to estimate the inputs to our power calculations. With an induced value of $3.00, the average offer rose from $2.701 when the upper bound of the support set was $4.00 to $3.158 when the upper bound was $6.00, a shift of $0.457. Each subject participated in both treatments, so the comparison is within-subject. The pooled standard deviation of the two offer distributions was 0.895. In the power analysis presented below, we conservatively assume a between-treatment correlation of 0.3, lower than the observed correlation of 0.501; a lower correlation implies a larger standard deviation of the paired differences and therefore a larger required sample.
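As a rough illustration of how these inputs combine into a standardized paired effect size, the sketch below computes Cohen's d_z from the reported means, pooled standard deviation, and correlation. It assumes both treatments share the same standard deviation (the pooled value), which is a simplification; a calculation based on treatment-specific standard deviations would give somewhat different numbers. The function name `paired_effect_size` is introduced here for illustration only.

```python
import math


def paired_effect_size(mean_a, mean_b, pooled_sd, corr):
    """Cohen's d_z for a within-subject contrast.

    ASSUMPTION: both treatments have the same SD (the pooled SD), so the
    SD of the paired differences is pooled_sd * sqrt(2 * (1 - corr)).
    """
    sd_diff = pooled_sd * math.sqrt(2.0 * (1.0 - corr))
    return abs(mean_b - mean_a) / sd_diff


# Inputs reported in the text (Drichoutis et al., 2025 treatments).
dz_conservative = paired_effect_size(2.701, 3.158, 0.895, corr=0.3)
dz_observed = paired_effect_size(2.701, 3.158, 0.895, corr=0.501)

print(f"d_z with assumed correlation 0.3:    {dz_conservative:.3f}")
print(f"d_z with observed correlation 0.501: {dz_observed:.3f}")
```

Under the equal-variance assumption, the lower assumed correlation produces the smaller effect size, which is why it is the conservative choice.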
Our power calculation uses the asymptotic relative efficiency (ARE) method, which first estimates the sample size required by a parametric paired t-test at a given power level and then converts it to the sample size required by the nonparametric Wilcoxon signed-rank test that we use to test our primary hypothesis (Faul et al., 2009). With the inputs above, the t-test requires a sample size of 51. The ARE of the Wilcoxon test relative to the t-test is approximately 0.955, implying a sample size of 53 (Faul et al., 2009). Using the observed correlation in place of the conservative value of 0.3 yields a sample size of 39.
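The two-step logic of the ARE method can be sketched as follows. This is a minimal normal-approximation version, not the exact noncentral-t computation performed by dedicated power software, so its results can differ from the figures above by a few subjects. The significance level, the power level, and the effect size passed in the example call are placeholders: the text does not state the first two, and `required_n` is a name introduced here for illustration.

```python
import math
from statistics import NormalDist

# ARE of the Wilcoxon signed-rank test relative to the t-test under
# normality: 3 / pi, approximately 0.955.
ARE_WILCOXON = 3.0 / math.pi


def required_n(dz, alpha, power, two_sided=True):
    """Return (n_ttest, n_wilcoxon) using a normal approximation.

    Step 1: approximate paired t-test sample size for effect size dz.
    Step 2: inflate by 1 / ARE to obtain the Wilcoxon sample size.
    """
    z = NormalDist()
    tail = alpha / 2.0 if two_sided else alpha
    z_alpha = z.inv_cdf(1.0 - tail)
    z_power = z.inv_cdf(power)
    n_ttest = math.ceil(((z_alpha + z_power) / dz) ** 2)
    n_wilcoxon = math.ceil(n_ttest / ARE_WILCOXON)
    return n_ttest, n_wilcoxon


# PLACEHOLDER inputs: dz, alpha and power here are illustrative only.
n_t, n_w = required_n(dz=0.5, alpha=0.05, power=0.80)
print(f"t-test n: {n_t}, Wilcoxon signed-rank n: {n_w}")
```

Because the ARE is below one, the Wilcoxon requirement is always at least as large as the t-test requirement in this sketch.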
As additional reference points, we repeat the analysis using the results from our first experiment (AEARCTR-0016635). With an induced value of $3.00, mean willingness to accept increased from $3.413 in the $6.00 support range treatment to $5.451, with a pooled standard deviation of 2.069 and a between-treatment correlation of 0.42. This yields an effect size of 0.823 and a required sample size of 15. Comparing the $6 treatment against the $4 treatment, we observe a mean shift of $0.737, a pooled standard deviation of 0.917, and a correlation of 0.52, which yields an effect size of 0.792 and a required sample size of 16. Reducing the correlation to a hypothetical 0.25 raises these required sample sizes to a range of 17 to 22 subjects.
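To show how the paired effect size responds to the assumed correlation, the sketch below uses the general formula for the standard deviation of paired differences with treatment-specific standard deviations. The mean shift and correlations come from the text; the two individual standard deviations are hypothetical, chosen only so that their pooled value is close to the reported 0.917, because the treatment-specific values are not reported here. The output is therefore illustrative and is not expected to match the effect sizes quoted above.

```python
import math


def dz_from_sds(mean_diff, sd_a, sd_b, corr):
    """Paired effect size d_z allowing different SDs in each treatment.

    Var(difference) = sd_a^2 + sd_b^2 - 2 * corr * sd_a * sd_b
    """
    var_diff = sd_a ** 2 + sd_b ** 2 - 2.0 * corr * sd_a * sd_b
    return abs(mean_diff) / math.sqrt(var_diff)


mean_shift = 0.737          # reported mean shift, $6 vs $4 treatment
sd_a, sd_b = 0.85, 0.98     # HYPOTHETICAL treatment-specific SDs

# Sweep from the observed correlation down to the hypothetical 0.25.
sweep = {corr: dz_from_sds(mean_shift, sd_a, sd_b, corr)
         for corr in (0.52, 0.40, 0.25)}

for corr, dz in sweep.items():
    print(f"correlation {corr:.2f} -> d_z {dz:.3f}")
```

A lower correlation reduces d_z, which in turn raises the required sample size for any fixed significance and power level.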
In the interest of conservatism, we adopt the larger threshold of 53 subjects determined for the Phase 1 study. Accordingly, we target a sample of 75 participants, which is this minimum threshold inflated to allow for incomplete or otherwise unusable responses.
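The arithmetic behind the recruitment target can be checked with a short sketch. It computes the share of responses that could be lost while still retaining the 53-subject minimum, and shows the general gross-up formula. The 25% loss rate in the example call is hypothetical, since the text does not state an expected loss rate, and `recruits_needed` is a name introduced here for illustration.

```python
import math

minimum_n = 53   # required usable sample, from the text
target_n = 75    # recruitment target, from the text

# Largest share of responses that can be unusable while keeping minimum_n.
tolerated_loss = 1.0 - minimum_n / target_n


def recruits_needed(min_n, loss_rate):
    """Recruitment target so that min_n usable responses remain."""
    return math.ceil(min_n / (1.0 - loss_rate))


print(f"Tolerated loss at target of {target_n}: {tolerated_loss:.1%}")
# HYPOTHETICAL loss rate of 25%, for illustration only.
print(f"Recruits needed at 25% loss: {recruits_needed(minimum_n, 0.25)}")
```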