Minimum detectable effect size for main outcomes (accounting for sample
design and clustering)
For each medication analyzed, we aim to detect a medium effect size at a 0.05 significance level with 80% power. We focus on two main outcomes: physician rating scores and comment content similarity.
(1) For rating scores on a 5-point scale, the minimum detectable effect size is 0.15 (Cohen's f²), equivalent to a 0.3-point difference. Without control variables, 54 reviews (divided evenly between treatment and control) are sufficient. With control variables (e.g., gender, age, employment type), 108 reviews (also divided evenly between treatment and control) are required. The regression model will estimate the influence of expert ratings on general physicians' scores, adjusting for demographic factors.
(2) For comment similarity, the minimum detectable effect size is 0.25 (Cohen's f), which requires 248 pairwise comparisons per group. This translates to a minimum of 16 comments per group, or 64 comments in total, to yield enough pairwise comparisons. An ANCOVA will compare similarity across groups to assess whether exposure to different expert types influences the language used in physician reviews.
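The two calculations above can be sketched in Python as a rough cross-check. This is an illustrative sketch under stated assumptions, not the procedure used to produce the figures in this plan. The function names are hypothetical, the power formula uses the noncentral F distribution with noncentrality f² × N (the convention used by G*Power), and the number of covariates is left as a parameter because the text does not fix it. The mapping from comments to pairwise comparisons is also not specified here, so both a cross-group and a within-group pairing are shown as possible readings.

```python
from math import comb

from scipy.stats import f as f_dist, ncf


def regression_power(f2, n, n_tested=1, n_covariates=0, alpha=0.05):
    """Power of the F test on `n_tested` predictors in a linear regression.

    Assumption: noncentrality parameter = f2 * n (G*Power convention).
    """
    df_num = n_tested
    df_den = n - n_tested - n_covariates - 1  # residual degrees of freedom
    if df_den <= 0:
        raise ValueError("sample size too small for this many predictors")
    f_crit = f_dist.ppf(1 - alpha, df_num, df_den)
    # Probability that a noncentral F statistic exceeds the critical value.
    return float(ncf.sf(f_crit, df_num, df_den, f2 * n))


def min_comments_cross_group(pairs_needed):
    """Smallest n such that two groups of n comments give n * n cross pairs.

    Assumption: each comment in one group is paired with each in the other.
    """
    n = 1
    while n * n < pairs_needed:
        n += 1
    return n


def min_comments_within_group(pairs_needed):
    """Smallest n such that n comments give n * (n - 1) / 2 within-group pairs.

    Assumption: comments are paired only with others in the same group.
    """
    n = 2
    while comb(n, 2) < pairs_needed:
        n += 1
    return n


if __name__ == "__main__":
    # Rating scores: one tested predictor, no controls, 54 reviews.
    print(regression_power(f2=0.15, n=54))
    # Comment similarity: comments needed to reach 248 pairs.
    print(min_comments_cross_group(248))   # 16
    print(min_comments_within_group(248))  # 23
```

Under the cross-group reading, 16 comments per group yield 256 pairs, which matches the stated minimum of 16. Under the within-group reading, 23 comments would be needed. The power for 54 reviews without controls should come out close to the 80% target, and passing a nonzero `n_covariates` shows how added controls reduce the residual degrees of freedom.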
- For further details, refer to the Pre-Analysis Plan document.