The Personalization Trap: Does AI Tutoring Build Learners or Dependence?

Last registered on March 05, 2026

Pre-Trial

Trial Information

General Information

Title
The Personalization Trap: Does AI Tutoring Build Learners or Dependence?
RCT ID
AEARCTR-0017917
Initial registration date
February 27, 2026

First published
March 05, 2026, 8:41 AM EST

Locations

There is information in this trial unavailable to the public.

Primary Investigator

Affiliation
World Bank

Other Primary Investigator(s)

PI Affiliation
World Bank
PI Affiliation
Middlebury College

Additional Trial Information

Status
In development
Start date
2026-03-01
End date
2026-12-15
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This study evaluates the impact of an AI-powered math tutoring platform and an AI career coaching chatbot on learning outcomes and educational aspirations among 5th-year secondary students in public schools in Lima, Peru. The AI math tutor covers Peru's national math curriculum and is used during regular math class time in school computer labs, supervised by students' regular math teacher. The AI career coach helps students explore post-secondary options, understand economic returns to education, and develop concrete career plans during existing tutoría (guidance) periods.

The study uses a cluster-randomized design with approximately 110 schools as the unit of randomization and three study arms: (1) AI tutoring in standard configuration plus career coaching, (2) AI tutoring with learning-science modifications plus career coaching, and (3) business-as-usual control. The learning-science modifications embed three moments of student cognitive effort (a retrieval practice opener at the start of each session, a predict-reveal-diagnose sequence on errors, and an error detection task on mastered topics) that return to the student metacognitive operations the algorithm otherwise performs on their behalf.

AI tutoring platforms could produce learning gains that are substantially non-durable because the algorithm silently performs metacognitive operations (retrieval, error diagnosis, mastery evaluation) that students need to learn to perform independently, a phenomenon we term the personalization trap. The three modifications test whether returning these cognitive operations to the student makes learning gains persist after platform removal. Primary outcomes are math assessments administered in the weeks after the intervention and again near the end of the academic year. The study tests whether AI tutoring produces learning gains within existing public-school constraints and whether learning-science modifications to the platform's default interaction design improve the durability of those gains.
External Link(s)

Registration Citation

Citation
Lopez, Carolina, Ezequiel Molina and Germán Reyes. 2026. "The Personalization Trap: Does AI Tutoring Build Learners or Dependence?" AEA RCT Registry. March 05. https://doi.org/10.1257/rct.17917-1.0
Experimental Details

Interventions

Intervention(s)
The study has three arms:

- Arm 1 (AI Tutoring + Career Coach): Students use the AI math tutoring platform in its standard configuration, without the learning-science modifications: same content library, same adaptive difficulty algorithm, same session duration, and same visual interface as Arm 2, with full AI support throughout. Students also use the AI career coach during tutoría periods, identical to Arm 2.

- Arm 2 (AI Tutoring with Learning-Science Modifications + Career Coach): Students use the AI math tutoring platform with three embedded modifications grounded in the science of learning. Weeks 1-4 are identical to Arm 1 (full AI support); the modifications activate in Week 5.
  (a) Retrieval Opener (~2 min): at the start of each session, 3 problems from the previous session’s topics with no hints, no error diagnosis, and no worked examples, just “Correct” or “Incorrect.”
  (b) Predict-Reveal-Diagnose on Errors (~2 min/session): before the AI shows a worked example, the student predicts their error type (3 buttons); after seeing the worked example, the student articulates their specific error in an open text box.
  (c) Error Detection on Mastered Topics (~10 sec, once/session): after 3 consecutive correct answers on a topic, the student is shown a worked solution containing one conceptual error and must identify the error step and describe the error.
  Students also use the AI career coach during tutoría periods.

- Arm 3 (Business-as-Usual Control): Students continue with regular classroom math instruction and regular tutoría activities, with no access to the AI platform or career coach.

Both treatment arms 1 and 2 use the platform during a portion of regular weekly math hours in the school computer lab, supervised by the regular math teacher.
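The three modifications described above form a simple per-session protocol. As an illustrative sketch only (function and step names are hypothetical, not the platform's actual implementation):

```python
def session_plan(week, consecutive_correct, made_error):
    """Sketch of the modified-arm session flow.

    Weeks 1-4 mirror the standard configuration (full AI support);
    the three learning-science modifications activate in Week 5.
    """
    steps = []
    if week >= 5:
        # (a) Retrieval opener: 3 problems from the previous
        # session, graded only correct/incorrect (no hints).
        steps.append("retrieval_opener_3_problems")
        # (b) Predict-reveal-diagnose: triggered by an error,
        # before the worked example is shown.
        if made_error:
            steps.append("predict_error_type")
            steps.append("show_worked_example")
            steps.append("self_diagnose_in_text_box")
        # (c) Error detection: once per session, after 3
        # consecutive correct answers on a mastered topic.
        if consecutive_correct >= 3:
            steps.append("find_error_in_worked_solution")
    steps.append("adaptive_practice")  # shared with the standard arm
    return steps
```

The key design point is that all three steps run without AI scaffolding, so the student performs the retrieval, diagnosis, and evaluation the algorithm otherwise does silently.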
Intervention Start Date
2026-03-23
Intervention End Date
2026-07-31

Primary Outcomes

Primary Outcomes (end points)
1. Test + Transfer Test: Math test administered in the weeks following the intervention. Items require applying mathematical principles to novel problem types not encountered on the platform.
2. Delayed Test: Math test administered near end of school year (November-December 2026), approximately 4-5 months post-intervention.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
- During-Intervention Performance: Platform’s internal assessment module, with full AI access. Measures AI-supported performance.
- Mid-Intervention “Lights Off” Assessment: 45-minute test, administered without any AI access. Measures independent performance at intervention midpoint.
- Content Test: Same content domains but free-response format, no AI access.
- Platform behavioral data (from platform logs): Hint requests per session, time reading error explanations, problem re-attempt rates after errors, give-up events, total problems attempted and time on task. For Arm 2 (learning-science modifications) only: retrieval opener accuracy, prediction-diagnosis match rate, and error detection accuracy.
- Metacognitive measures: Calibration accuracy, persistence on hard problems, error self-diagnosis, and study time allocation efficiency.
- Career coaching outcomes: Educational aspirations, career knowledge, career plan quality, and beliefs about earnings by education level.
- Implementation fidelity: Compliance rates, technical failure rates, teacher classroom observation scores, and student satisfaction.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Cluster-randomized controlled trial with schools as the unit of randomization, conducted in public secondary schools in Lima, Peru. Approximately 110 schools are randomized into three arms: (1) AI tutoring in standard configuration plus career coaching, (2) AI tutoring with learning-science modifications plus career coaching, and (3) business-as-usual control.

The study enrolls approximately 6,600 5th-year secondary students (ages 16-17). Randomization occurs at the school level to avoid contamination between conditions. Schools are selected based on having a functional computer lab, adequate internet connectivity, and sufficient devices, verified by the implementing partner (DRELM, Dirección Regional de Educación de Lima Metropolitana). Data collection includes pre- and post-intervention math assessments, platform usage analytics, and student and teacher surveys, all conducted during regular school hours. The study runs for one full academic year.
Experimental Design Details
Not available
Randomization Method
Randomization was conducted in an office using Stata.
Randomization Unit
School. All students in 5th-year secondary at a given school receive the same treatment condition. Stratified by school size and prior academic performance.
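As an illustration of the assignment logic (the actual randomization was done in Stata, and the stratum definition and seed here are assumptions), stratified school-level assignment to the three arms in a 2:2:1 ratio can be sketched as:

```python
import random

def assign_schools(schools, strata_key, seed=12345):
    """Illustrative stratified cluster randomization.

    schools: list of school identifiers.
    strata_key: function mapping a school to its stratum
                (e.g., school size x prior performance cell).
    Assigns arms in a 2:2:1 ratio within each stratum.
    """
    rng = random.Random(seed)  # seed is a placeholder, not the study's
    assignment = {}
    by_stratum = {}
    for s in schools:
        by_stratum.setdefault(strata_key(s), []).append(s)
    for group in by_stratum.values():
        n = len(group)
        # Target 2:2:1 split within the stratum; any remainder
        # after integer division is absorbed by the control cell.
        n_arm = (n - n // 5) // 2
        labels = (["arm1"] * n_arm + ["arm2"] * n_arm
                  + ["control"] * (n - 2 * n_arm))
        rng.shuffle(labels)
        for school, label in zip(group, labels):
            assignment[school] = label
    return assignment
```

With 110 schools in a single stratum this yields the registered 44/44/22 split; real strata would each receive an approximate 2:2:1 allocation.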
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
110 schools
Sample size: planned number of observations
6600 students
Sample size (or number of clusters) by treatment arms
44 schools in Arm 1, 44 schools in Arm 2, 22 schools in control
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
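The registration leaves this field blank. As a rough illustration only, a standard minimum detectable effect calculation for a clustered two-arm comparison can be sketched as follows; the intracluster correlation (0.15), students per school (60), and normal-approximation critical values are assumptions, not registered parameters:

```python
import math

def mde_clustered(j1, j2, n, icc, z_alpha=1.96, z_power=0.84):
    """Illustrative minimum detectable effect (in SD units) for
    comparing two arms of a cluster-randomized trial.

    j1, j2: number of clusters (schools) in each arm
    n:      students per cluster
    icc:    assumed intracluster correlation
    """
    deff = 1 + (n - 1) * icc                # design effect from clustering
    var = 1.0 / (j1 * n) + 1.0 / (j2 * n)   # variance of the mean difference
    return (z_alpha + z_power) * math.sqrt(deff * var)

# Treatment arm (44 schools) vs. control (22 schools),
# ~60 students per school, assumed ICC of 0.15:
print(round(mde_clustered(44, 22, 60, 0.15), 3))
```

Under these assumed parameters the design detects effects on the order of 0.3 SD for a treatment-vs-control comparison; the pooled treatment arms or a lower ICC would shrink this figure.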
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number