The Personalization Trap: Does AI Tutoring Build Learners or Dependence?

Last registered on March 05, 2026

Pre-Trial

Trial Information

General Information

Title
The Personalization Trap: Does AI Tutoring Build Learners or Dependence?
RCT ID
AEARCTR-0017917
Initial registration date
February 27, 2026

First published
March 05, 2026, 8:41 AM EST

Locations

There is information in this trial unavailable to the public.

Primary Investigator

Affiliation
World Bank

Other Primary Investigator(s)

PI Affiliation
World Bank
PI Affiliation
Middlebury College

Additional Trial Information

Status
In development
Start date
2026-03-01
End date
2026-12-15
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This study evaluates the impact of an AI-powered math tutoring platform and an AI career coaching chatbot on learning outcomes and educational aspirations among 5th-year secondary students in public schools in Lima, Peru. The AI math tutor covers Peru's national math curriculum and is used during regular math class time in school computer labs, supervised by students' regular math teacher. The AI career coach helps students explore post-secondary options, understand economic returns to education, and develop concrete career plans during existing tutoría (guidance) periods.

The study uses a cluster-randomized design with approximately 110 schools as the unit of randomization and three study arms: (1) AI tutoring in standard configuration plus career coaching, (2) AI tutoring with learning-science modifications plus career coaching, and (3) business-as-usual control. The learning-science modifications embed three moments of student cognitive effort (a retrieval practice opener at the start of each session, a predict-reveal-diagnose sequence on errors, and an error detection task on mastered topics) that return to the student metacognitive operations the algorithm otherwise performs on their behalf.

AI tutoring platforms could produce learning gains that are substantially non-durable because the algorithm silently performs metacognitive operations (retrieval, error diagnosis, mastery evaluation) that students need to learn to perform independently, a phenomenon we term the personalization trap. The three modifications test whether returning these cognitive operations to the student makes learning gains persist after platform removal. Primary outcomes are math assessments administered in the weeks after the intervention and again near the end of the academic year. The study tests whether AI tutoring produces learning gains within existing public-school constraints and whether learning-science modifications to the platform's default interaction design improve the durability of those gains.
External Link(s)

Registration Citation

Citation
Lopez, Carolina, Ezequiel Molina and Germán Reyes. 2026. "The Personalization Trap: Does AI Tutoring Build Learners or Dependence?" AEA RCT Registry. March 05. https://doi.org/10.1257/rct.17917-1.0
Experimental Details

Interventions

Intervention(s)
The study has three arms:

- Arm 1 (AI Tutoring + Career Coach): Students use the AI math tutoring platform in its standard configuration, without the learning-science modifications: same content library, same adaptive difficulty algorithm, same session duration, and same visual interface as Arm 2, with full AI support throughout. Students also use the AI career coach during tutoría periods, identical to Arm 2.

- Arm 2 (AI Tutoring with Learning-Science Modifications + Career Coach): Students use the AI math tutoring platform with three embedded modifications grounded in the science of learning. Weeks 1-4 are identical to Arm 1 (full AI support); the modifications activate in Week 5.
  (a) Retrieval Opener (~2 min): at the start of each session, 3 problems from the previous session’s topics with no hints, no error diagnosis, and no worked examples, just “Correct” or “Incorrect.”
  (b) Predict-Reveal-Diagnose on Errors (~2 min/session): before the AI shows a worked example, the student predicts their error type (3 buttons); after seeing the worked example, the student articulates their specific error in an open text box.
  (c) Error Detection on Mastered Topics (~10 sec, once/session): after 3 consecutive correct answers on a topic, the student is shown a worked solution containing one conceptual error and must identify the error step and describe the error.
  Students also use the AI career coach during tutoría periods.

- Arm 3 (Business-as-Usual Control): Students continue with regular classroom math instruction and regular tutoría activities, with no access to the AI platform or career coach.

Both treatment arms 1 and 2 use the platform during a portion of regular weekly math hours in the school computer lab, supervised by the regular math teacher.
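The three modifications described above form a simple per-session protocol. As an illustrative sketch only (function and step names are hypothetical, not the platform's actual implementation):

```python
def session_plan(week, consecutive_correct, made_error):
    """Sketch of the modified-arm session flow.

    Weeks 1-4 mirror the standard configuration (full AI support);
    the three learning-science modifications activate in Week 5.
    """
    steps = []
    if week >= 5:
        # (a) Retrieval opener: 3 problems from the previous
        # session, graded only correct/incorrect (no hints).
        steps.append("retrieval_opener_3_problems")
        # (b) Predict-reveal-diagnose: triggered by an error,
        # before the worked example is shown.
        if made_error:
            steps.append("predict_error_type")
            steps.append("show_worked_example")
            steps.append("self_diagnose_in_text_box")
        # (c) Error detection: once per session, after 3
        # consecutive correct answers on a mastered topic.
        if consecutive_correct >= 3:
            steps.append("find_error_in_worked_solution")
    steps.append("adaptive_practice")  # shared with the standard arm
    return steps
```

The key design point is that all three steps run without AI scaffolding, so the student performs the retrieval, diagnosis, and evaluation the algorithm otherwise does silently.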
Intervention Start Date
2026-03-23
Intervention End Date
2026-07-31

Primary Outcomes

Primary Outcomes (end points)
1. Test + Transfer Test: Math test administered in the weeks following the intervention. Items require applying mathematical principles to novel problem types not encountered on the platform.
2. Delayed Test: Math test administered near end of school year (November-December 2026), approximately 4-5 months post-intervention.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
- During-Intervention Performance: Platform’s internal assessment module, with full AI access. Measures AI-supported performance.
- Mid-Intervention “Lights Off” Assessment: 45-minute test, administered without any AI access. Measures independent performance at intervention midpoint.
- Content Test: Same content domains but free-response format, no AI access.
- Platform behavioral data (from platform logs): Hint requests per session, time reading error explanations, problem re-attempt rates after errors, give-up events, total problems attempted and time on task. For Arm 2 (learning-science modifications) only: retrieval opener accuracy, prediction-diagnosis match rate, and error detection accuracy.
- Metacognitive measures: Calibration accuracy, persistence on hard problems, error self-diagnosis, and study time allocation efficiency.
- Career coaching outcomes: Educational aspirations, career knowledge, career plan quality, and beliefs about earnings by education level.
- Implementation fidelity: Compliance rates, technical failure rates, teacher classroom observation scores, and student satisfaction.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Cluster-randomized controlled trial with schools as the unit of randomization, conducted in public secondary schools in Lima, Peru. Approximately 110 schools are randomized into three arms: (1) AI tutoring in standard configuration plus career coaching, (2) AI tutoring with learning-science modifications plus career coaching, and (3) business-as-usual control.

The study enrolls approximately 6,600 5th-year secondary students (ages 16-17). Randomization occurs at the school level to avoid contamination between conditions. Schools are selected based on having a functional computer lab, adequate internet connectivity, and sufficient devices, verified by the implementing partner (DRELM, Dirección Regional de Educación de Lima Metropolitana). Data collection includes pre- and post-intervention math assessments, platform usage analytics, and student and teacher surveys, all conducted during regular school hours. The study runs for one full academic year.
Experimental Design Details
Not available
Randomization Method
Randomization was conducted in an office using Stata.
Randomization Unit
School. All students in 5th-year secondary at a given school receive the same treatment condition. Stratified by school size and prior academic performance.
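As an illustration of the assignment logic (the actual randomization was done in Stata, and the stratum definition and seed here are assumptions), stratified school-level assignment to the three arms in a 2:2:1 ratio can be sketched as:

```python
import random

def assign_schools(schools, strata_key, seed=12345):
    """Illustrative stratified cluster randomization.

    schools: list of school identifiers.
    strata_key: function mapping a school to its stratum
                (e.g., school size x prior performance cell).
    Assigns arms in a 2:2:1 ratio within each stratum.
    """
    rng = random.Random(seed)  # seed is a placeholder, not the study's
    assignment = {}
    by_stratum = {}
    for s in schools:
        by_stratum.setdefault(strata_key(s), []).append(s)
    for group in by_stratum.values():
        n = len(group)
        # Target 2:2:1 split within the stratum; any remainder
        # after integer division is absorbed by the control cell.
        n_arm = (n - n // 5) // 2
        labels = (["arm1"] * n_arm + ["arm2"] * n_arm
                  + ["control"] * (n - 2 * n_arm))
        rng.shuffle(labels)
        for school, label in zip(group, labels):
            assignment[school] = label
    return assignment
```

With 110 schools in a single stratum this yields the registered 44/44/22 split; real strata would each receive an approximate 2:2:1 allocation.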
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
110 schools
Sample size: planned number of observations
6600 students
Sample size (or number of clusters) by treatment arms
44 schools in Arm 1, 44 schools in Arm 2, 22 schools in control
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
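The registration leaves this field blank. As a rough illustration only, a standard minimum detectable effect calculation for a clustered two-arm comparison can be sketched as follows; the intracluster correlation (0.15), students per school (60), and normal-approximation critical values are assumptions, not registered parameters:

```python
import math

def mde_clustered(j1, j2, n, icc, z_alpha=1.96, z_power=0.84):
    """Illustrative minimum detectable effect (in SD units) for
    comparing two arms of a cluster-randomized trial.

    j1, j2: number of clusters (schools) in each arm
    n:      students per cluster
    icc:    assumed intracluster correlation
    """
    deff = 1 + (n - 1) * icc                # design effect from clustering
    var = 1.0 / (j1 * n) + 1.0 / (j2 * n)   # variance of the mean difference
    return (z_alpha + z_power) * math.sqrt(deff * var)

# Treatment arm (44 schools) vs. control (22 schools),
# ~60 students per school, assumed ICC of 0.15:
print(round(mde_clustered(44, 22, 60, 0.15), 3))
```

Under these assumed parameters the design detects effects on the order of 0.3 SD for a treatment-vs-control comparison; the pooled treatment arms or a lower ICC would shrink this figure.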
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number