Evaluating User Bias in Content Preference: AI-Generated Texts Versus Popular Online Platforms

Last registered on September 17, 2024

Pre-Trial

Trial Information

General Information

Title
Evaluating User Bias in Content Preference: AI-Generated Texts Versus Popular Online Platforms
RCT ID
AEARCTR-0014231
Initial registration date
September 10, 2024

The initial registration date is when the registration was submitted to the Registry to be reviewed for publication.

First published
September 17, 2024, 11:38 AM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Locations

There is information in this trial that is unavailable to the public.

Primary Investigator

Affiliation
UiS

Other Primary Investigator(s)

PI Affiliation
HSE University, Perm
PI Affiliation
HSE University, Perm
PI Affiliation
HSE University, Perm
PI Affiliation
HSE University, Perm

Additional Trial Information

Status
In development
Start date
2024-10-01
End date
2025-01-01
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This study explores user preferences between AI-generated explanations and those curated by popular platforms like Quora. We identify popular questions in five scientific fields and collect responses from both the platforms and AI models. In a field experiment, participants are presented with texts from both sources and asked to indicate their preferred explanation. The findings have the potential to reveal trends in user bias. Our analysis provides insights into the evolving role of AI in content generation and its comparison with traditional human-curated knowledge platforms.
External Link(s)

Registration Citation

Citation
Matkin, Nikita et al. 2024. "Evaluating User Bias in Content Preference: AI-Generated Texts Versus Popular Online Platforms." AEA RCT Registry. September 17. https://doi.org/10.1257/rct.14231-1.0
Experimental Details

Interventions

Intervention(s)
Intervention Start Date
2024-10-01
Intervention End Date
2025-01-01

Primary Outcomes

Primary Outcomes (end points)
The primary outcome is the participant's stated preference between the two presented texts.
Primary Outcomes (explanation)
Participants are presented with two different texts. After reading both texts, participants have three response options: "Prefer text 1", "Prefer text 2", or "Do not have a preference".

Secondary Outcomes

Secondary Outcomes (end points)
Demographic information: Gender and age
Education
Secondary Outcomes (explanation)
We want to examine whether demographics significantly influence the results. Additionally, because the questions cover different scientific topics, we want to examine whether participants' educational background influences the results.

Experimental Design

Experimental Design
We recruit students at a higher education institution and participants from a popular survey platform.

All participants are first presented with an overview of the aim of the experiment, whom to contact, and how the data will be stored. The experiment gathers no data that could identify individuals.

We first ask demographic questions (age and gender) and about educational background. Participants are then presented with five questions from five different scientific fields (depending on the education they selected). Each question includes two responses: one generated by AI and one human-written response taken from popular platforms. Participants state which response they prefer (the response options are binary: Text 1 or Text 2). We do not reveal whether a response is AI- or human-generated.
Experimental Design Details
Not available
Randomization Method
We use a statistical application (R, Stata, or Mathematica). The randomization is implemented in code, making it reproducible.
Randomization Unit
We randomize at the survey level; that is, we construct different survey versions with different structures.
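As an illustration only, a minimal sketch of reproducible, survey-level randomization in R. The seed value, field names, number of survey versions, and the choice to randomize which source appears as Text 1 are placeholders, not part of the registered protocol.

```r
# Minimal sketch; seed, field labels, and data layout are illustrative assumptions.
set.seed(20241001)  # fixed seed so the assignment can be reproduced

fields <- c("physics", "biology", "chemistry", "economics", "computer_science")  # placeholder field names
n_survey_versions <- 4                                                           # placeholder number of versions

# For each survey version and field, randomly decide whether the AI-generated
# response appears as "Text 1" or "Text 2"; the source is never shown to participants.
design <- expand.grid(survey_version = seq_len(n_survey_versions), field = fields)
design$ai_position <- sample(c("Text 1", "Text 2"), size = nrow(design), replace = TRUE)

design  # one row per survey version x field, with the blinded position of the AI text
```

Because the seed is fixed, rerunning the script reproduces the same survey structures, so participants assigned to a given survey version always see the same blinded ordering.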
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
We plan to include university participants and participants from popular survey platforms (similar to MTurk).
Sample size: planned number of observations
We plan to have at least 500 participants. This results in 5,000 observations as each participant responds to ten questions.
Sample size (or number of clusters) by treatment arms
Two groups (students and participants from popular survey platforms), 500 participants in total across both groups, each responding to ten questions.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
For our experiment, we conducted a power analysis to determine the appropriate sample size required to detect a statistically significant effect. The analysis was based on several key parameters: an effect size of 0.2, which corresponds to a small to medium effect size according to Cohen's conventions, a significance level (alpha) of 0.05, and a desired statistical power of 0.8. The study design includes 4 different groups (representing the different models) and 5 repeated measurements per respondent (corresponding to the different fields of knowledge). Using the `pwr.anova.test` function in R, the power analysis indicated that a total of 346 participants are required to achieve the desired power for detecting differences between the groups in this repeated measures ANOVA. This sample size ensures that our study is sufficiently powered to detect small to moderate effects, reducing the likelihood of Type II errors.
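For reference, the base one-way ANOVA power calculation with these parameters can be reproduced with the `pwr` package in R. This sketch shows only the per-group sample size returned by `pwr.anova.test`; it does not include the repeated-measures adjustment reflected in the registered total of 346 participants.

```r
# Base power calculation (sketch); the repeated-measures adjustment is applied separately.
library(pwr)

pwr.anova.test(k = 4,            # number of groups (the different models)
               f = 0.2,          # effect size (Cohen's f)
               sig.level = 0.05, # alpha
               power = 0.8)      # desired statistical power
# Returns the required sample size per group for a one-way ANOVA with these inputs.
```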
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number