The Effects of AI Feedback: Evidence from a Large-Scale Math Experiment

Last registered on December 09, 2025

Pre-Trial

Trial Information

General Information

Title
The Effects of AI Feedback: Evidence from a Large-Scale Math Experiment
RCT ID
AEARCTR-0017399
Initial registration date
December 07, 2025


First published
December 09, 2025, 8:13 AM EST


Locations

Region

Primary Investigator

Affiliation
UNIVERSITÀ COMMERCIALE LUIGI BOCCONI

Other Primary Investigator(s)

PI Affiliation
Inter-American Development Bank
PI Affiliation
Northwestern University
PI Affiliation
GRADE

Additional Trial Information

Status
In development
Start date
2025-12-08
End date
2025-12-14
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
We evaluate how different forms of feedback delivered through a large-scale digital math platform affect the learning outcomes of primary school students in public schools in Peru. The platform “Apprendemos” currently provides weekly math activities of 30 exercises to students in grades 3–6 across Peru. In a one-week experiment, students will be individually randomized into one of two feedback conditions when completing the 30-item activity: (i) status quo feedback, indicating only whether an answer is correct or incorrect; or (ii) AI-generated feedback, which provides explanations produced by a generative AI chatbot. For all students, items 1–5 will serve as a pre-test and items 26–30 as a post-test, while items 6–25 will be used to deliver the differentiated feedback.
External Link(s)

Registration Citation

Citation
Aulagnon, Raphaelle et al. 2025. "The Effects of AI Feedback: Evidence from a Large-Scale Math Experiment." AEA RCT Registry. December 09. https://doi.org/10.1257/rct.17399-1.0
Experimental Details

Interventions

Intervention(s)
The intervention modifies the feedback that students receive while completing a 30-exercise math activity on the Apprendemos platform.
Control – Status quo Apprendemos feedback
For all 30 items, students receive the standard platform feedback, which simply indicates whether an answer is correct or incorrect.
Treatment – AI feedback
Items 1–5: standard Apprendemos feedback.
Items 6–25: AI-generated feedback based on the student’s current response and past responses, produced via a generative AI model integrated into the platform.
Items 26–30: standard Apprendemos feedback.
If the student answers correctly on the first or second attempt, they receive the same “correct” message as in the status quo Apprendemos feedback (kept constant across arms).
On the first incorrect attempt, the student receives a hint. On the second incorrect attempt, the student receives an explanation and the correct answer.
Messages combine pedagogical content (e.g., explanations of the relevant math concept or procedure) and motivational content (e.g., encouragement to keep trying).
The intervention is implemented in collaboration with the Inter-American Development Bank (IDB), GRADE, and Apprendemos, with software development provided by the firm Pyxis to enable the AI feedback treatment.
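The attempt-dependent routing described above can be sketched as a minimal function. This is an illustrative reconstruction, not the platform's implementation: the function name and the returned message labels are placeholders, and in the actual treatment the hint and explanation texts would come from the generative AI model.

```python
def treatment_feedback(item_number: int, attempt: int, is_correct: bool) -> str:
    """Sketch of feedback routing in the AI arm (hypothetical names).

    Items 1-5 and 26-30 always receive the status quo message; items
    6-25 receive attempt-dependent AI feedback. Correct answers get the
    same message as in the control arm, kept constant across arms.
    """
    if not 6 <= item_number <= 25:
        return "status_quo_feedback"        # same as control arm
    if is_correct:
        return "correct_message"            # constant across arms
    if attempt == 1:
        return "ai_hint"                    # first incorrect attempt
    return "ai_explanation_and_answer"      # second incorrect attempt
```

For example, a student answering item 10 incorrectly twice would first see an AI-generated hint, then an explanation together with the correct answer.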
Intervention (Hidden)
Intervention Start Date
2025-12-08
Intervention End Date
2025-12-14

Primary Outcomes

Primary Outcomes (end points)
Academic achievement in the post-test. The main outcome is academic achievement measured using the post-test, defined as the number of correct answers in items 26–30. We will construct a standardized outcome by transforming the raw post-test score into a z-score with mean 0 and standard deviation 1 in the control group. All analyses will control for baseline performance, measured as the number of correct answers in the pre-test (items 1–5), also standardized using the distribution in the control group.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The experiment is embedded in the regular operation of the Apprendemos digital math platform for public primary schools in Peru.
The intervention consists of a single 30-exercise math activity delivered during one week.
Students are individually randomized into two groups: Control and AI-Based Feedback.
The sample of users selected to participate in the experiment includes those who:
a. Do not take part in other special interventions in the Apprendemos program
b. Connected at least once in the preceding year
c. Use the app in online mode (since this is necessary to display the generative AI feedback)
Experimental Design Details
Randomization Method
Randomization was done in Stata on the selected sample (described above):
a. Sample of 43,003 students meeting the requirements
b. We randomly remove one student to obtain two equally sized groups
c. We divide the sample into two equal groups, with no stratification
Randomization Unit
Student
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
43,002 students
Sample size: planned number of observations
43,002 students
Sample size (or number of clusters) by treatment arms
21,501 control, 21,501 AI

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
ViaLibre
IRB Approval Date
2025-11-04
IRB Approval Number
13225

Post-Trial

Post Trial Information

Study Withdrawal


Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials