Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
For our experiment, we conducted a power analysis to determine the sample size required to detect a statistically significant effect. The analysis was based on three parameters: an effect size of f = 0.2 (Cohen's f), which lies between the conventional small (0.10) and medium (0.25) benchmarks; a significance level (alpha) of 0.05; and a desired statistical power of 0.8. The study design includes 4 groups (one per model) and 5 repeated measurements per respondent (one per field of knowledge). Using the `pwr.anova.test` function from the R package `pwr`, the power analysis indicated that a total of 346 participants are required to achieve the desired power for detecting differences between the groups. Because `pwr.anova.test` computes power for a balanced one-way between-groups ANOVA, this calculation does not model the correlation among the 5 repeated measurements within a respondent. With this sample size, the study is intended to be sufficiently powered to detect small-to-medium effects, keeping the probability of a Type II error at or below 0.2 for an effect of the assumed size.
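As a rough illustration of the kind of calculation involved, the sketch below solves for the per-group sample size of a balanced one-way ANOVA using the noncentral F distribution in base R, which is the quantity `pwr.anova.test` reports. It uses only the parameters stated above (4 groups, f = 0.2, alpha = 0.05, power = 0.8). The helper name `anova_power` and the search interval are assumptions made for this example, and since the exact call behind the reported figure is not given in the text, the result need not match the 346 participants reported. The sketch also ignores the 5 repeated measurements per respondent.

```r
# Design parameters taken from the text.
k         <- 4     # number of groups (models)
f         <- 0.2   # effect size (Cohen's f)
sig_level <- 0.05  # alpha
target    <- 0.80  # desired power

# Hypothetical helper: power of the one-way ANOVA F test with n
# respondents per group, via the noncentral F distribution
# (noncentrality parameter = k * n * f^2).
anova_power <- function(n, k, f, sig_level) {
  df1  <- k - 1
  df2  <- k * (n - 1)
  crit <- qf(sig_level, df1, df2, lower.tail = FALSE)
  pf(crit, df1, df2, ncp = k * n * f^2, lower.tail = FALSE)
}

# Solve for the (fractional) per-group n that reaches the target power.
# The interval c(2, 1000) is an assumption chosen to bracket the root.
n_exact <- uniroot(
  function(n) anova_power(n, k, f, sig_level) - target,
  interval = c(2, 1000)
)$root

# Round up to whole respondents and scale to the full design.
n_per_group <- ceiling(n_exact)
total_n     <- k * n_per_group

cat("n per group:", n_per_group, "\n")
cat("total N:    ", total_n, "\n")
```

Note that the value solved for is a per-group n, so the total sample size is that value (rounded up) multiplied by the number of groups. A design-specific calculation for the repeated-measures structure would additionally need an assumed within-respondent correlation, which the text does not state.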