The impact of ChatGPT on policy debates: a randomized control trial

Last registered on March 21, 2023

Pre-Trial

Trial Information

General Information

Title
The impact of ChatGPT on policy debates: a randomized control trial
RCT ID
AEARCTR-0011113
Initial registration date
March 19, 2023

First published
March 21, 2023, 4:44 PM EDT

Locations

Region

Primary Investigator

Affiliation
ESADE and London School of Economics

Other Primary Investigator(s)

Additional Trial Information

Status
Ongoing
Start date
2023-03-12
End date
2023-09-30
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This paper studies the impact of ChatGPT on policy debates. I am planning to run a randomized controlled trial in a real debating competition involving 120 undergraduate students from Esade Business School. About half of the students, the treatment group, will be randomly assigned a short training on ChatGPT and will be allowed to use it as support throughout three rounds of one-to-one debates. The other half of the students, the control group, will only be allowed to use conventional internet access to prepare the debates. I will be testing (a) whether ChatGPT improves the debating skills of participants; (b) whether it contributes to reducing inequality in debating skills; (c) whether it favors evidence-based policy positions; and (d) whether it contributes to changing the beliefs of debate participants.
External Link(s)

Registration Citation

Citation
Roldan Mones, Antonio. 2023. "The impact of ChatGPT on policy debates: a randomized control trial." AEA RCT Registry. March 21. https://doi.org/10.1257/rct.11113-1.0
Experimental Details

Interventions

Intervention(s)
My intervention consists of testing the effect of ChatGPT on policy debates. I will do that through a randomized controlled trial at Esade Business School involving 120 students. The treatment will consist of a 20-minute training on ChatGPT and the option to use it to prepare three to four rounds of debates. Students in the control group will only be allowed to use conventional internet access to prepare the debates. I will be testing four main hypotheses: (a) whether ChatGPT improves the debating skills of participants in the experiment; (b) whether it contributes to reducing inequality in debating skills; (c) whether it favors evidence-based policy positions; and (d) whether it contributes to changing the beliefs of debate participants. I will also test whether some of these effects persist over time.


Intervention Start Date
2023-03-20
Intervention End Date
2023-03-31

Primary Outcomes

Primary Outcomes (end points)
The main outcomes are:
1) Debating points: individual points awarded to debate participants, a continuous variable from 0 to 100. These points will be assigned by expert judges. All debates will be audio-recorded during the competition; overall I expect to have about 240 recorded debates. Judges will be either former debating champions or university teachers of debating/communication, and they will be blind to the experiment.
2) Win/lose debate: whether the participant wins or loses each debate, as determined by the judges.
3) Belief update: variation between baseline and endline views on the debated policy.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
I plan to apply for access to student-level administrative data from the university to look at additional outcomes, such as grades in other subjects. The hypothesis is that having been trained to use ChatGPT may increase students' use of the tool in other subjects and improve their performance.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
To test the first hypothesis I will use two strategies. First, I will compare average results of participants with and without ChatGPT. Second, I will compare outcomes within individuals, taking advantage of the fact that individuals will only be assigned to the ChatGPT condition after one round of debates.

I will test the second hypothesis by checking whether ChatGPT has an asymmetric impact depending on participants' initial (baseline) debating skills. I will use the first round of debates to divide participants into top (50%) and bottom (50%) performers, and then compare average debating points in rounds 2 and 3 for top and bottom performers among individuals who use ChatGPT and among individuals who do not. This will tell me whether the effect of ChatGPT differs depending on baseline performance.

All debates will be on policy topics with two clearly different sides: a "pro-evidence" side and an "anti-evidence" side. I will look at the interaction of using ChatGPT and being assigned the anti-evidence position. The goal is to test whether the effects of using ChatGPT are stronger when it is used to defend pro-evidence or anti-evidence arguments.

Finally, I will test the "persuasive" effect of ChatGPT: whether it contributes to changing the beliefs of debate participants. Do students assigned to defend a specific position update their beliefs on that topic? In what direction? Is that conditioned by the use of ChatGPT? I will do this by testing whether subjects move closer, after the debate, to the position they were randomly assigned to defend, and whether this effect is larger for those using ChatGPT.
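As a concrete illustration, below is a minimal sketch (in Python with statsmodels) of the kind of specifications these comparisons imply. The dataset "debates.csv" and all variable names (points, chatgpt, top_baseline, anti_evidence, student_id, round_no, day) are hypothetical placeholders, and the registered analysis may differ in its exact form.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant-round.
df = pd.read_csv("debates.csv")
post = df[df["round_no"] > 1]  # keep post-treatment rounds only

def cluster_fit(formula, data):
    # OLS with standard errors clustered at the student level, since each
    # student contributes several debate rounds.
    return smf.ols(formula, data=data).fit(
        cov_type="cluster", cov_kwds={"groups": data["student_id"]}
    )

# (a) Average effect of ChatGPT access on debating points, with round and
#     competition-day fixed effects.
m_ate = cluster_fit("points ~ chatgpt + C(round_no) + C(day)", post)

# (b) Heterogeneity by baseline skill: interaction with an indicator for
#     being a top-50% performer in the first (pre-treatment) round.
m_het = cluster_fit("points ~ chatgpt * top_baseline + C(round_no) + C(day)", post)

# (c) Pro- vs anti-evidence positions: interaction with the randomly
#     assigned side of the motion.
m_side = cluster_fit("points ~ chatgpt * anti_evidence + C(round_no) + C(day)", post)

for m in (m_ate, m_het, m_side):
    print(m.summary())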

Experimental Design Details
Randomization Method
Randomization will be done in office by a computer. I will randomize students into the Chat and NoChat conditions, randomly match debating couples for all rounds of debates, and randomly assign debate positions ("for" or "against") within couples. I will use baseline information to balance the randomization.
Randomization Unit
I will randomize at the individual level. I will use two blocks, one for each of the two days on which the debating competition will take place. Since individuals will do several rounds of debates, I will have up to four observations per individual.
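As an illustration of this assignment procedure, below is a minimal sketch of computer randomization blocked by competition day and balanced on a baseline covariate, written in Python. The file "baseline.csv" and the column names (student_id, day, baseline_score) are hypothetical placeholders; the actual balancing procedure may differ.

import numpy as np
import pandas as pd

rng = np.random.default_rng(20230312)  # fixed seed so the draw is reproducible
df = pd.read_csv("baseline.csv")       # one row per student

def assign_within_block(block):
    # Sort students by baseline score, pair adjacent students, and give
    # ChatGPT to a random member of each pair (a singleton "pair" at the end
    # of an odd-sized block could instead be assigned by a fair coin flip).
    block = block.sort_values("baseline_score").reset_index(drop=True)
    block["pair"] = block.index // 2
    block["treat"] = 0
    for _, pair_rows in block.groupby("pair"):
        treated = rng.choice(pair_rows.index)
        block.loc[treated, "treat"] = 1
    return block

# Block by competition day, then assign within each block.
assignments = df.groupby("day", group_keys=False).apply(assign_within_block)
print(assignments[["student_id", "day", "treat"]])

The random matching of debating couples and the random "for"/"against" positions within couples would follow the same pattern: shuffle students within each block, pair consecutive rows, and draw the position for each pair with rng.choice.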
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
120 students
Sample size: planned number of observations
I plan to obtain about 480 observations from four debating rounds of 120 students.
Sample size (or number of clusters) by treatment arms
60 students treatment (use of ChatGPT)
60 students control (no use of ChatGPT)
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
The minimum detectable effect size for my main outcome is 0.374 of a standard deviation.
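For context on this figure, below is a back-of-the-envelope sketch of a conventional two-sample MDE calculation using statsmodels, under assumed values of alpha = 0.05 and power = 0.80. The registered 0.374 SD presumably also reflects the repeated-rounds structure of the design, which is not detailed here, so the two simple benchmarks below are only meant to bracket it.

from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()

# Treating each student as a single observation: 60 treatment vs 60 control.
mde_students = power.solve_power(nobs1=60, ratio=1.0, alpha=0.05, power=0.80)

# Treating every planned debate observation as independent: about 240 per arm
# (an upper bound on precision, since rounds within a student are correlated).
mde_rounds = power.solve_power(nobs1=240, ratio=1.0, alpha=0.05, power=0.80)

print(f"MDE with 60 students per arm: {mde_students:.3f} SD")
print(f"MDE with 240 observations per arm: {mde_rounds:.3f} SD")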
IRB

Institutional Review Boards (IRBs)

IRB Name
Esade Research Ethics Committee
IRB Approval Date
2023-03-13
IRB Approval Number
011/2023

Post-Trial

Post Trial Information

Study Withdrawal

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials