The impact of ChatGPT on policy debates: a randomized control trial

Last registered on March 21, 2023

Pre-Trial

Trial Information

General Information

Title
The impact of ChatGPT on policy debates: a randomized control trial
RCT ID
AEARCTR-0011113
Initial registration date
March 19, 2023

First published
March 21, 2023, 4:44 PM EDT

Locations

Region

Primary Investigator

Affiliation
ESADE and London School of Economics

Other Primary Investigator(s)

Additional Trial Information

Status
Ongoing
Start date
2023-03-12
End date
2023-09-30
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This paper studies the impact of ChatGPT on policy debates. I am planning to run a randomized controlled trial in a real debating competition involving 120 undergraduate students from Esade Business School. About half of the students, the treatment group, will be randomly assigned a short training on ChatGPT and will be allowed to use it as support throughout three rounds of one-to-one debates. The other half of the students, the control group, will only be allowed to use conventional internet access to prepare the debates. I will be testing (a) whether ChatGPT improves the debating skills of participants; (b) whether it contributes to reducing inequality in debating skills; (c) whether it favors evidence-based policy positions; and (d) whether it contributes to changing the beliefs of debate participants.
External Link(s)

Registration Citation

Citation
Roldan Mones, Antonio. 2023. "The impact of ChatGPT on policy debates: a randomized control trial." AEA RCT Registry. March 21. https://doi.org/10.1257/rct.11113-1.0
Experimental Details

Interventions

Intervention(s)
My intervention consists of testing the effect of ChatGPT on policy debates. I will do that through a randomized controlled trial at Esade Business School involving 120 students. The treatment will consist of a 20-minute training on ChatGPT and the option to use it to prepare three to four rounds of debates. Students in the control group will only be allowed to use conventional internet access to prepare the debates. I will be testing four main hypotheses: (a) whether ChatGPT improves the debating skills of participants in the experiment; (b) whether it contributes to reducing inequality in debating skills; (c) whether it favors evidence-based policy positions; and (d) whether it contributes to changing the beliefs of debate participants. I will also test whether some of these effects persist over time.


Intervention Start Date
2023-03-20
Intervention End Date
2023-03-31

Primary Outcomes

Primary Outcomes (end points)
The main outcomes are:
1) Debating points: individual points awarded to debate participants, a continuous variable from 0 to 100. These points will be assigned by expert judges. All debates will be audio-recorded during the competition; overall I expect to have about 240 recorded debates. Judges will be either former debating champions or university teachers of debating/communication, and they will be blind to the experiment.
2) Win/lose debate: whether the participant wins or loses each debate, as determined by the judges.
3) Belief update: variation between baseline and endline views on the debated policy.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
I plan to apply for access to student-level administrative data from the university to look at additional outcomes, such as grades in other subjects. The hypothesis is that having been trained to use ChatGPT may increase students' use of the tool in other subjects and improve their performance.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
To test the first hypothesis I will use two strategies. First, I will compare average results of participants with and without ChatGPT. Second, I will compare outcomes within individuals, taking advantage of the fact that individuals will only be assigned to the ChatGPT condition after one round of debates.

I will test the second hypothesis by checking whether ChatGPT has an asymmetric impact depending on participants' initial (baseline) debating skills. I will use the first round of debates to divide participants into top (50%) and bottom (50%) performers, and then compare average debating points in rounds 2 and 3 for top and bottom performers among individuals who use ChatGPT and among individuals who do not. This will tell me whether the effect of ChatGPT differs depending on baseline performance.

All debates will be on policy topics with two clearly different sides: a "pro-evidence" side and an "anti-evidence" side. I will look at the interaction of using ChatGPT and being assigned the anti-evidence position. The goal is to test whether the effects of using ChatGPT are stronger when it is used to defend pro-evidence or anti-evidence arguments.

Finally, I will test the "persuasive" effect of ChatGPT: whether it contributes to changing the beliefs of debate participants. Do students assigned to defend a specific position update their beliefs on that topic? In what direction? Is that conditioned by the use of ChatGPT? I will do this by testing whether subjects move closer, after the debate, to the position they were randomly assigned to defend, and whether this effect is larger for those using ChatGPT.
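As a concrete illustration, below is a minimal sketch (in Python with statsmodels) of the kind of specifications these comparisons imply. The dataset "debates.csv" and all variable names (points, chatgpt, top_baseline, anti_evidence, student_id, round_no, day) are hypothetical placeholders, and the registered analysis may differ in its exact form.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant-round.
df = pd.read_csv("debates.csv")
post = df[df["round_no"] > 1]  # keep post-treatment rounds only

def cluster_fit(formula, data):
    # OLS with standard errors clustered at the student level, since each
    # student contributes several debate rounds.
    return smf.ols(formula, data=data).fit(
        cov_type="cluster", cov_kwds={"groups": data["student_id"]}
    )

# (a) Average effect of ChatGPT access on debating points, with round and
#     competition-day fixed effects.
m_ate = cluster_fit("points ~ chatgpt + C(round_no) + C(day)", post)

# (b) Heterogeneity by baseline skill: interaction with an indicator for
#     being a top-50% performer in the first (pre-treatment) round.
m_het = cluster_fit("points ~ chatgpt * top_baseline + C(round_no) + C(day)", post)

# (c) Pro- vs anti-evidence positions: interaction with the randomly
#     assigned side of the motion.
m_side = cluster_fit("points ~ chatgpt * anti_evidence + C(round_no) + C(day)", post)

for m in (m_ate, m_het, m_side):
    print(m.summary())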

Experimental Design Details
Randomization Method
Randomization will be done in office by a computer. I will randomize students into the Chat and NoChat conditions, randomly match debating couples for all rounds of debates, and randomly assign debate positions ("for" or "against") within couples. I will use baseline information to balance the randomization.
Randomization Unit
I will randomize at the individual level. I will use two blocks, one for each of the two days on which the debating competition will take place. Since individuals will do several rounds of debates, I will have up to four observations per individual.
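As an illustration of this assignment procedure, below is a minimal sketch of computer randomization blocked by competition day and balanced on a baseline covariate, written in Python. The file "baseline.csv" and the column names (student_id, day, baseline_score) are hypothetical placeholders; the actual balancing procedure may differ.

import numpy as np
import pandas as pd

rng = np.random.default_rng(20230312)  # fixed seed so the draw is reproducible
df = pd.read_csv("baseline.csv")       # one row per student

def assign_within_block(block):
    # Sort students by baseline score, pair adjacent students, and give
    # ChatGPT to a random member of each pair (a singleton "pair" at the end
    # of an odd-sized block could instead be assigned by a fair coin flip).
    block = block.sort_values("baseline_score").reset_index(drop=True)
    block["pair"] = block.index // 2
    block["treat"] = 0
    for _, pair_rows in block.groupby("pair"):
        treated = rng.choice(pair_rows.index)
        block.loc[treated, "treat"] = 1
    return block

# Block by competition day, then assign within each block.
assignments = df.groupby("day", group_keys=False).apply(assign_within_block)
print(assignments[["student_id", "day", "treat"]])

The random matching of debating couples and the random "for"/"against" positions within couples would follow the same pattern: shuffle students within each block, pair consecutive rows, and draw the position for each pair with rng.choice.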
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
120 students
Sample size: planned number of observations
I plan to obtain about 480 observations from four debating rounds of 120 students.
Sample size (or number of clusters) by treatment arms
60 students treatment (use of ChatGPT)
60 students control (no use of ChatGPT)
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
The minimum detectable effect size for my main outcome is 0.374 of a standard deviation.
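For context on this figure, below is a back-of-the-envelope sketch of a conventional two-sample MDE calculation using statsmodels, under assumed values of alpha = 0.05 and power = 0.80. The registered 0.374 SD presumably also reflects the repeated-rounds structure of the design, which is not detailed here, so the two simple benchmarks below are only meant to bracket it.

from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()

# Treating each student as a single observation: 60 treatment vs 60 control.
mde_students = power.solve_power(nobs1=60, ratio=1.0, alpha=0.05, power=0.80)

# Treating every planned debate observation as independent: about 240 per arm
# (an upper bound on precision, since rounds within a student are correlated).
mde_rounds = power.solve_power(nobs1=240, ratio=1.0, alpha=0.05, power=0.80)

print(f"MDE with 60 students per arm: {mde_students:.3f} SD")
print(f"MDE with 240 observations per arm: {mde_rounds:.3f} SD")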
IRB

Institutional Review Boards (IRBs)

IRB Name
Esade Research Ethics Committee
IRB Approval Date
2023-03-13
IRB Approval Number
011/2023

Post-Trial

Post Trial Information

Study Withdrawal

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials