Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
We used the clustersampsi command in Stata to compute the minimum number of clusters and the sample size per cluster. The outcome indicator, productivity (kg/m2), has a mean of 1.14 and a standard deviation of 0.52. Under complete randomization (that is, without clustering), the minimum sample size would have been 67 per arm at 80% power and a 5% significance level.
Because randomization is done at the cluster (district) level, we must account for the intraclass correlation (ICC), which is 0.10. The districts also have different numbers of farmers, and therefore different cluster sizes, so the sample size must be adjusted further to detect the minimum effect we are interested in. Cluster sizes range from 3 to 37, with a coefficient of variation of 0.82. Accounting for both the ICC and the varying cluster sizes, the sample size must be inflated by a design effect of 2.57.
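As an illustration of where a design effect of this size can come from, the sketch below applies a commonly used approximation for unequal cluster sizes, DE = 1 + ((cv^2 + 1) * m - 1) * ICC, where m is the mean cluster size and cv is the coefficient of variation of cluster sizes. The mean cluster size of 10 is an assumption (roughly 192 farmers / 19 clusters); it is not stated in the text, and the function name is hypothetical. This is not necessarily the exact computation that clustersampsi performs.

```python
def design_effect(mean_cluster_size: float, icc: float, cv: float) -> float:
    """Approximate design effect for clusters of unequal size.

    DE = 1 + ((cv^2 + 1) * m - 1) * icc
    With cv = 0 this reduces to the familiar 1 + (m - 1) * icc.
    """
    return 1 + ((cv ** 2 + 1) * mean_cluster_size - 1) * icc


# Values from the text: ICC = 0.10, cv of cluster sizes = 0.82.
# ASSUMPTION: mean cluster size of 10 (about 192 / 19), not stated in the text.
de = design_effect(mean_cluster_size=10, icc=0.10, cv=0.82)
print(round(de, 2))  # 2.57
```

Under these assumptions the approximation gives 2.57. With equal cluster sizes (cv = 0) the same inputs would give 1.90, which shows how much of the inflation is due to the variation in cluster size.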
The study is limited by the number of districts with at least 3 small-scale active tilapia pond farmers. In the 4 major producing regions (Brong Ahafo, Ashanti, Eastern and Volta), only 38 districts meet this threshold, with a total of 384 farmers. With 1 treatment (aquaculture training), 19 clusters and 192 farmers are available in each study arm.
With this design effect, a sample of 192 farmers in 19 clusters per arm can detect a minimum increase of 18% in productivity (kg/m2). Local partners predict, and aim for, a 20% increase, so the study has enough power to detect the predicted change in the outcome.
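A rough consistency check on these figures can be sketched as follows. It inflates the unclustered requirement of 67 per arm by the design effect of 2.57 and compares the result with the 192 farmers available per arm. It assumes the 67-per-arm figure was computed for the effect size the study targets, which the text does not state, and it does not attempt to reproduce the 18% minimum detectable effect. The variable names are illustrative.

```python
import math

n_unclustered = 67    # per-arm sample size without clustering (from the text)
design_effect = 2.57  # inflation for ICC and unequal cluster sizes (from the text)
n_available = 192     # farmers available per arm (from the text)

# Per-arm sample size needed once clustering is accounted for, rounded up.
n_required = math.ceil(n_unclustered * design_effect)

# Number of independent observations the clustered sample is "worth".
n_effective = n_available / design_effect

print(n_required)              # 173
print(round(n_effective, 1))   # 74.7
print(n_required <= n_available)
```

Under this assumption, the clustered design needs about 173 farmers per arm, which is below the 192 available, and the effective sample size of about 74.7 exceeds the unclustered requirement of 67.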
Finally, because we use pair matching before randomization, we can deflate the design effect further to reflect the resulting gain in efficiency and power.