AI-Assisted Writing for Teaching Reading and English as a Second Language (ESL)

Last registered on January 22, 2026

Pre-Trial

Trial Information

General Information

Title
AI-Assisted Writing for Teaching Reading and English as a Second Language (ESL)
RCT ID
AEARCTR-0017611
Initial registration date
January 18, 2026


First published
January 22, 2026, 7:05 AM EST


Locations

Location information in this trial is not available to the public.

Primary Investigator

Affiliation
Stanford University

Other Primary Investigator(s)

PI Affiliation
Stanford University
PI Affiliation
Stanford University

Additional Trial Information

Status
In development
Start date
2026-01-18
End date
2026-04-04
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Teaching children reading and English as a second language (ESL) skills, which provide lifelong returns, requires access to engaging, contextually relevant materials. For children outside of Western contexts, English-language story options often lack familiar settings, norms, problems, and characters. We aim to learn whether stories written with generative AI assistance can fill that gap by creating context-relevant stories that better engage children in India.
External Link(s)

Registration Citation

Citation
Athey, Susan, Kristine Koutout and Yuyan Wang. 2026. "AI-Assisted Writing for Teaching Reading and English as a Second Language (ESL)." AEA RCT Registry. January 22. https://doi.org/10.1257/rct.17611-1.0
Sponsors & Partners

Sponsor and partner information in this trial is not available to the public.
Experimental Details

Interventions

Intervention(s)
Our intervention delivers stories written with AI assistance through an edtech app designed to teach reading and ESL skills.
Intervention Start Date
2026-01-18
Intervention End Date
2026-03-28

Primary Outcomes

Primary Outcomes (end points)
Stage 1
1. binary outcome for clicking a story in either of the two trays on the edtech app within one week of first login during the experimental period
2. binary outcome for completing a story in either of the two trays on the edtech app within one week of first login during the experimental period
3. binary outcome for completing a story, conditional on clicking a story, in either of the two trays on the edtech app within one week of first login during the experimental period

Stage 2
1. number of stories clicked on the edtech app during a three-week period
2. number of stories completed on the edtech app during a three-week period
3. story engagement index on the edtech app during a three-week period
Primary Outcomes (explanation)
Story engagement index is constructed as follows: a click receives a value of 0.3, a start receives a value of 0.5, and a complete receives a value of 1.0. These values are summed across all stories for a user.
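As an illustrative formalization (the exact aggregation is an assumption, since the registration does not state whether a completed story also accrues the click and start values), the index for user u could be written as

  Index_u = \sum_{s \in S_u} ( 0.3 \, \mathbf{1}[\text{click}_{us}] + 0.5 \, \mathbf{1}[\text{start}_{us}] + 1.0 \, \mathbf{1}[\text{complete}_{us}] ),

where S_u is the set of stories user u interacted with during the relevant period. If instead only the furthest stage reached is meant to count for each story, the inner sum would be replaced by a maximum over the three terms.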

Secondary Outcomes

Secondary Outcomes (end points)
1. number of stories liked
2. probability of clicking an additional story, conditional on completing a story
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
An edtech firm will implement an experiment that randomizes users to see different stories on its app for improving reading and ESL skills in two stages, where stages are defined relative to a user's first engagement during the experimental period. In Stage 1, the intervention affects the stories that users see in the top two "trays" in the app, where trays are horizontally scrollable rows of story icons from which users choose stories to read. Users will be randomized into three groups: (1) users in the "AI Stories" group see stories written with AI assistance; (2) users in the "Matched Stories" group see stories selected to match the AI Stories on word count (while maintaining similar quality to average stories on the app); and (3) users in the "Baseline" group continue to see the standard, traditionally sourced stories. In Stage 1, users assigned to the AI Stories group see only the first story from each AI Stories series, so that we can separately identify the effect of AI-assisted writing on a single, standalone story versus a series. Stage 1 will answer research questions on whether users engage more with stories generated using AI assistance than with traditionally sourced stories.

Treatment assignments are persistent in Stage 2; that is, users remain in the same group across both stages. The nature of the intervention changes in Stage 2 for the AI Stories and Matched Stories groups: the intervention now affects the stories shown throughout the app's story tabs. Users in the AI Stories group see the full series of AI Stories. Users in the Matched Stories group see traditionally sourced series on the app. Users in the Baseline group continue to see the standard, traditionally sourced stories. This stage will answer research questions about how exposure to AI Stories affects overall engagement with the app, and will allow us to distinguish the impact of the series produced by the AI Stories product from the impact of series more generally, since both the AI Stories and Matched Stories groups see stories that belong to series. Users may increase engagement if they enjoy a series, as they become invested in the premise and know what to expect. On the other hand, the focus on series will decrease the overall diversity of content relative to the baseline, which involves a mix of series and standalone stories.
Experimental Design Details
Not available
Randomization Method
Randomization done using a computer.
Randomization Unit
We separately randomize users who are associated with a school and users who are not. Users with school IDs are randomized at the school level; all other users are randomized at the individual level (a minimal sketch of this assignment appears below).
Was the treatment clustered?
Yes
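
The following is a minimal sketch of the assignment procedure described above, assuming a pandas table of users with a nullable school_id column. The column names, the use of numpy/pandas, and the seed are illustrative assumptions, not the partner's implementation; the 40/40/20 arm shares come from the sample-size fields below.

import numpy as np
import pandas as pd

ARMS = ["AI Stories", "Matched Stories", "Baseline"]
PROBS = [0.4, 0.4, 0.2]  # arm shares from the sample-size section below

def assign(users: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Assign each user to an arm, clustering by school where a school ID exists."""
    rng = np.random.default_rng(seed)
    users = users.copy()

    # Cluster-randomize users who have a school ID: draw one arm per school.
    has_school = users["school_id"].notna()
    schools = users.loc[has_school, "school_id"].unique()
    school_arm = pd.Series(rng.choice(ARMS, size=len(schools), p=PROBS), index=schools)
    users.loc[has_school, "arm"] = users.loc[has_school, "school_id"].map(school_arm)

    # Individually randomize users without a school ID.
    users.loc[~has_school, "arm"] = rng.choice(ARMS, size=(~has_school).sum(), p=PROBS)
    return users

Drawing one arm per school keeps all users in a given school in the same arm, which is what makes the treatment clustered, consistent with the field above.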

Experiment Characteristics

Sample size: planned number of clusters
90
Sample size: planned number of observations
8100
Sample size (or number of clusters) by treatment arms
AI Stories - 40% or 3,240 users
Matched Stories - 40% or 3,240 users
Baseline - 20% or 1,620 users
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
These are preliminary calculations based on data received from our partner; however, these data do not contain valid school ID information, among other limitations, so we are unable to fully implement our proposed analysis plan for these power calculations.

Stage 1 outcomes                                          AI vs Baseline   Matched vs Baseline   AI vs Matched
Panel A: Pr(clicking TFY or PoF)
  Mean                                                    0.18             0.18                  0.18
  MDE                                                     0.029            0.029                 0.024
Panel B: Pr(completing TFY or PoF)
  Mean                                                    0.12             0.12                  0.12
  MDE                                                     0.025            0.025                 0.020
Panel C: Pr(completing TFY or PoF | clicking TFY or PoF)
  Mean                                                    0.64             0.64                  0.64
  MDE                                                     0.086            0.086                 0.070

Stage 2 outcomes                                          AI vs Baseline   Matched vs Baseline   AI vs Matched
Panel A: Log-transformed Number of Clicks
  Mean                                                    0.64             0.64                  0.64
  SD                                                      1.13             1.13                  1.13
  MDE                                                     0.094            0.094                 0.077
  MDE/SD                                                  0.083            0.083                 0.068
Panel B: Log-transformed Number of Completions
  Mean                                                    0.54             0.54                  0.54
  SD                                                      1.04             1.04                  1.04
  MDE                                                     0.086            0.086                 0.071
  MDE/SD                                                  0.083            0.083                 0.068
Panel C: Log-transformed Story Engagement Index
  Mean                                                    0.60             0.60                  0.60
  SD                                                      1.09             1.09                  1.09
  MDE                                                     0.091            0.091                 0.074
  MDE/SD                                                  0.083            0.083                 0.068
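
For reference, a standard minimum-detectable-effect formula for a binary outcome under cluster randomization (a generic sketch only; the figures above are based on partner data and may use a different procedure) is

  \text{MDE} \approx (z_{1-\alpha/2} + z_{1-\beta}) \sqrt{\text{DEFF}} \, \sqrt{\bar{p}(1-\bar{p}) \left( \tfrac{1}{n_1} + \tfrac{1}{n_0} \right)}, \qquad \text{DEFF} = 1 + (m - 1)\rho,

where \bar{p} is the control-group mean, n_1 and n_0 are the arm sample sizes, m is the average cluster size, and \rho is the intracluster correlation coefficient.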
IRB

Institutional Review Boards (IRBs)

IRB Name
Stanford University
IRB Approval Date
2025-10-16
IRB Approval Number
82812