Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
We use existing apartment listing data from a pre-trial on the same online platform in Houston, TX to identify the sample size required for adequate statistical power. The Houston pre-trial data contain 1,563 listings. The pre-trial yielded a 17.9% response rate to white names and a 16.7% response rate to names associated with African American or Latinx/Hispanic identities (non-white names). It also yielded a relatively balanced sample with respect to within-ZIP-code quartiles of toxic concentration: 25% of properties fall in the first quartile, 21% in the second, 21% in the third, and 33% in the quartile with the highest toxic concentration. With respect to proximity to TRI facilities, 45% of the rental properties are located within 1 mile of a toxic plant.
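As a point of reference for the odds ratios reported in this section, the two raw response rates can be converted into an unadjusted odds ratio. The short sketch below does only that arithmetic; the variable and function names are illustrative, and the result is not one of the model-based estimates discussed later, which condition on toxic concentration and plant proximity.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


# Response rates reported for the Houston pre-trial.
rate_white = 0.179
rate_nonwhite = 0.167

# Unadjusted odds ratio of a response, non-white names relative to white names.
raw_odds_ratio = odds(rate_nonwhite) / odds(rate_white)
print(f"Unadjusted odds ratio (non-white vs. white): {raw_odds_ratio:.2f}")
```

The raw gap of 1.2 percentage points corresponds to an unadjusted odds ratio of roughly 0.92.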
To compute the sample sizes and the minimum detectable effects for the interaction of race and proximity to a toxic plant, we assume 90% power and a 0.05 significance level. In the Houston pre-trial sample, we estimate odds ratios of 0.65 (0.48), 0.76 (0.30), 0.70 (0.32) and 0.79 (0.37) for the four quartiles of toxic concentration and 1.27 (0.42) for the interaction with plant proximity. Standard errors are clustered at the Houston ZIP code level.
We then simulate the effect of increasing the sample size in a conditional logit model with matched inquiries. Simulation results indicate that effect sizes of 0.41, 0.35, 0.65 and 1.12 can be detected with a sample size of 2,400 properties. Simulations for plant proximity suggest that an effect size of 1.54 can be detected with 3,017 properties. Figures 5-7 in our supporting materials plot simulation results (odds ratios and p-values) at different sample sizes. Using an alternative approach from Demidenko (2007, 2008), our simulations indicate that we need approximately 2,680 properties to obtain detectable odds ratios.
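The logic of a simulation-based power calculation of this kind can be illustrated with a minimal sketch. It is not the procedure used to produce the figures above. It assumes, purely for illustration, that each property receives exactly two matched inquiries (one white name, one non-white name), that the two responses are independent within a property, and that there are no covariates or clustering. Under those assumptions the conditional logit estimate for a binary regressor depends only on the discordant pairs, so the test reduces to a McNemar-type z-test. The function names, the default of 500 replications, and the grid of sample sizes are all hypothetical choices.

```python
import math
import random


def nonwhite_rate(white_rate, odds_ratio):
    """Response probability implied by applying an odds ratio to white_rate."""
    implied_odds = odds_ratio * white_rate / (1.0 - white_rate)
    return implied_odds / (1.0 + implied_odds)


def simulate_power(n_properties, odds_ratio, white_rate=0.179,
                   n_sims=500, z_crit=1.96, seed=0):
    """Share of simulated trials in which the matched-pair test rejects OR = 1.

    Assumes two inquiries per property and independent responses
    (illustrative assumptions, not taken from the study design).
    """
    rng = random.Random(seed)
    p_white = white_rate
    p_nonwhite = nonwhite_rate(white_rate, odds_ratio)
    rejections = 0
    for _ in range(n_sims):
        white_only = 0     # only the white-name inquiry gets a response
        nonwhite_only = 0  # only the non-white-name inquiry gets a response
        for _ in range(n_properties):
            white_resp = rng.random() < p_white
            nonwhite_resp = rng.random() < p_nonwhite
            if white_resp and not nonwhite_resp:
                white_only += 1
            elif nonwhite_resp and not white_resp:
                nonwhite_only += 1
        discordant = white_only + nonwhite_only
        if discordant == 0:
            continue  # no discordant pairs: the test cannot reject
        # McNemar-type statistic; only discordant pairs are informative.
        z = (nonwhite_only - white_only) / math.sqrt(discordant)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Hypothetical grid of sample sizes and one illustrative odds ratio.
    for n in (600, 1200, 2400):
        power = simulate_power(n, odds_ratio=0.65, n_sims=200)
        print(f"n = {n:5d}  simulated power = {power:.2f}")
```

Because this sketch ignores covariates, the quartile structure and ZIP-code clustering, its power values will differ from those of the full conditional logit simulations. It is meant only to show how rejection rates are traced out as the number of properties grows.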
Phillips (2016) provides evidence of within-trial impacts when multiple inquiries are sent in matched correspondence designs in competitive labor markets. In a sample restricted to responses to the first inquiry and based on a simple logit model, our simulations show that we should be able to detect effects with odds ratios of 0.52, 1.41, 1.70 and 0.98 at 2,337 properties and an odds ratio for the interaction with proximity to toxic plants of 1.43 at 3,676 properties. Figures 8-10 plot the results of these simulations. These power calculations are limited by the data available from the Houston pre-trial and by the incidence of discriminatory behavior, which may be particular to the Houston housing market.
References
Demidenko E. (2007). "Sample size determination for logistic regression revisited." Statistics in Medicine, 26:3385-3397.
Demidenko E. (2008). "Sample size and optimal design for logistic regression with binary interaction." Statistics in Medicine, 27:36-46.
Phillips D. C. (2016). "Do comparisons of fictional applicants measure discrimination when search externalities are present? Evidence from existing experiments." The Economic Journal.