Complementing Teachers: Can Artificial Intelligence Improve Student Learning by Addressing Learning Variability?

Last registered on January 28, 2026

Pre-Trial

Trial Information

General Information

Title
Complementing Teachers: Can Artificial Intelligence Improve Student Learning by Addressing Learning Variability?
RCT ID
AEARCTR-0017147
Initial registration date
January 26, 2026


First published
January 28, 2026, 7:48 AM EST


Locations

There is information in this trial unavailable to the public.

Primary Investigator

Affiliation
IÉSEG School of Management

Other Primary Investigator(s)

PI Affiliation
Vanderbilt University

Additional Trial Information

Status
Ongoing
Start date
2025-08-01
End date
2027-12-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Education systems in developing countries face a persistent learning crisis characterized by low average achievement and substantial disparities in student learning. This paper evaluates a teacher-level, artificial intelligence–enabled intervention designed to support lesson planning and promote within-classroom differentiated instruction. We implement a school-level randomized controlled trial in low-income public secondary schools in Bogotá, Colombia. The intervention provides mathematics teachers with access to an AI-based lesson-planning platform that integrates learner variability and growth-oriented pedagogical frameworks, reducing preparation costs while improving instructional alignment with heterogeneous student needs. We assess impacts on student learning, with particular attention to the heterogeneity of the effect in a context marked by pronounced educational inequality. Using baseline and endline student assessments and complementary teacher surveys, we estimate both intention-to-treat and treatment-on-the-treated effects, and examine treatment effect heterogeneity using pre-specified subgroup analyses and data-driven methods.
External Link(s)

Registration Citation

Citation
Barrera-Osorio, Felipe and Juan Munoz Morales. 2026. "Complementing Teachers: Can Artificial Intelligence Improve Student Learning by Addressing Learning Variability?." AEA RCT Registry. January 28. https://doi.org/10.1257/rct.17147-1.0
Sponsors & Partners

There is information in this trial unavailable to the public.
Experimental Details

Interventions

Intervention(s)
In partnership with the Secretaría de Educación Distrital (SED), the local education authority in Bogotá, Colombia, and Mentu, a Colombian startup that provides AI-driven learning technologies, we designed an intervention intended to enhance teachers’ value-added while addressing the heterogeneous learning needs of students from diverse backgrounds. The intervention targeted teachers working in low-income areas of the city and consisted of two components: (i) access to ShaIA, an AI-based pedagogical support platform designed to address learning variability; and (ii) a structured training program to support teachers in the effective use of the platform.

Access to AI-based pedagogical support: ShaIA is an AI-driven pedagogical ecosystem that provides personalized guidance to teachers. The platform is designed to support mathematics instruction by assisting teachers in planning and implementing lessons aligned with curricular standards and students’ diverse learning needs. Treated teachers were granted access to the platform, where they created a course profile—describing course characteristics to the algorithm—and specified learning objectives in mathematics.

ShaIA generates recommendations by channeling a large language model guided by two core pedagogical principles: learning variability and math mindsets.

1. Learning variability: ShaIA promotes inclusive teaching practices aligned with the prioritized learning objectives of the national curriculum. A central feature of the platform is its integration of the Learner Variability framework (leveraging Digital Promise’s Learner Variability Navigator), which tailors pedagogical recommendations based on models capturing students’ diverse backgrounds, needs, and learning profiles.

Teachers incorporate learning variability into ShaIA through a three-step process. First, they select a learning model from four available options: mathematics, language, adult learning, and 21st-century learners. While ShaIA provides an initial recommendation, teachers may modify the model to better align it with their instructional context and objectives. Second, teachers define a class profile by selecting relevant variability factors that characterize their students (e.g., the presence of non–Spanish-speaking students, migrants, students with disabilities, or students affected by trauma). Based on this profile, ShaIA presents a list of 35 factors identified as relevant for mathematics learning. Teachers then select between 10 and 15 factors they consider most salient for their classroom. Third, teachers select inclusion strategies. For each chosen factor, ShaIA suggests evidence-based pedagogical strategies that enable teachers to address classroom diversity simultaneously. These strategies are automatically incorporated into the generation of lesson plans, activities, and instructional resources, ensuring that the platform’s outputs are aligned with the actual characteristics of the class.

2. Math mindsets. ShaIA's suggested pedagogical strategies are grounded in the Math Mindsets approach, which draws on robust recent evidence on effective mathematics teaching. This approach emphasizes understanding, creativity, and confidence, and is rooted in the concept of a growth mindset applied specifically to mathematics learning.

Building on this framework, ShaIA offers tools for lesson planning involving activity design, project-based learning, and development of formative assessments and grading rubrics. Its recommendations are informed by the selected learner variability factors and structured around five pedagogical tools of the math mindsets framework: mathematical experience, worksheets, number talks, pattern talks, and feedback practice. These tools are designed to strengthen mathematical communication, reasoning, problem solving, and modeling, while also supporting procedural fluency as an outcome of numerical flexibility and algebraic thinking rather than through isolated or mechanical practice.

Overall, ShaIA is designed to complement rather than replace teachers. By automating lesson planning and reducing its preparation time, the platform aims to increase teacher engagement and promote higher-value activities such as instructional reflection and individualized student support. By moving beyond a one-size-fits-all approach, ShaIA seeks to enable teachers, particularly those with limited access to pedagogical resources, to adapt instructional strategies to classrooms with heterogeneous learning profiles.

Training on the use of ShaIA: Effective use of the platform requires that teachers understand its functionality and perceive its pedagogical value. To promote adoption and sustained use, Mentu and the SED implemented a structured training program for treated teachers. The program consisted of three school visits, three in-person workshops, and three one-hour webinars. The workshops were intended to familiarize teachers with the platform by modeling teaching strategies using ShaIA. The visits were intended to gather feedback and to review the teachers' work plan. Finally, the webinars were designed to reinforce ShaIA's instructional applications and support its integration into classroom practice.
Intervention Start Date
2025-08-15
Intervention End Date
2026-12-15

Primary Outcomes

Primary Outcomes (end points)
Student learning and teacher engagement.
Primary Outcomes (explanation)
Outcomes for both treatment and control groups will be constructed using the information gathered during data collection. Two main categories of outcomes are relevant for our study: teacher-level outcomes and student-level outcomes.

Teachers: Teachers represent the first step in the successful adoption of AI in the classroom. We expect the intervention to influence teaching strategies, which may in turn affect student learning and reduce variability in performance. The following outcomes will be collected to quantify potential changes in teaching practices:

1. Time spent preparing class: At baseline and endline, teachers will report the amount of time spent preparing their most recently taught class.
2. Share of class time devoted to different activities: Teachers will report the proportion of class time allocated to lecturing, practical exercises, group work, discussions, use of technology, and assessments.
3. Math Teaching Index: A principal component index capturing math teaching aptitude. It is constructed from the following items, each measured on a 1–5 scale:
- I feel confident teaching math concepts.
- My students actively participate in math classes.
- I use real-world examples to teach math.
- My math classes encourage problem-solving skills.
- Students collaborate with their peers during math activities.
- Technology enhances student learning in my math lessons.
- I have enough resources to teach mathematics effectively.
- My students can explain their mathematical reasoning clearly.
- It is beneficial for the class when students share different perspectives.
- I teach my students that anyone can become more intelligent at math.
- I provide students with individual feedback.

4. AI Perception Index: A principal component index measuring teachers’ perceptions and use of AI in the classroom. It is based on the following items, each on a 1–5 scale:

- I understand how AI can be used to support math instruction.
- I feel confident using AI to generate instructional material.
- I have received adequate training on how to use AI effectively.
- I use AI-generated content to explain math concepts.
- I evaluate AI-generated materials to ensure they are appropriate for students with differentiated learning.
- AI helps me identify and address gaps in student learning.
- I use AI-driven analytics to track student progress and adjust my teaching.
- I have access to AI-generated lesson plans, activities, or instructional sequences aligned with prioritized learning objectives.
- I modify AI-generated resources to better suit the needs of my students.
- My educational institution provides guidance on choosing and adapting AI-generated resources.
- AI has improved the efficiency and quality of my lesson planning.

Students: Our primary student-level outcome consists of test score measures designed and collected by Im-prove (a local expert company specialized in grade-appropriate tests approved by the local education authorities). The assessment includes 18 multiple-choice questions, each with four response options, and is expected to be completed in 50 minutes. (Testing was conducted in one-hour sessions, but 10 minutes were used to organize the classroom, give instructions, and distribute the materials.) The outcome corresponds to the share of correct answers, standardized relative to the mean and standard deviation of test-takers in each round. Although baseline and endline questions differ, both assessments are aligned with the curriculum and the student’s grade level.
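The within-round standardization of the test score can be sketched as follows, assuming raw counts of correct answers for one testing round (data below are illustrative):

```python
import numpy as np

def standardized_score(correct: np.ndarray, n_questions: int = 18) -> np.ndarray:
    """Share of correct answers, standardized within a testing round.

    correct: number of correct answers (0-18) per student in one round.
    Returns scores with mean 0 and standard deviation 1 among test-takers.
    """
    share = correct / n_questions
    return (share - share.mean()) / share.std()

# Hypothetical example for a single round of 1,000 test-takers
rng = np.random.default_rng(3)
scores = standardized_score(rng.integers(0, 19, size=1000).astype(float))
```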

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
We evaluate the effects of providing teachers with AI support by offering school-level access to ShaIA. Treatment was assigned among 158 eligible schools in Bogotá, Colombia. Due to budget constraints, the agreement with the SED stipulated that teachers in lower secondary (grades 6 to 9) in 58 schools would receive access to the platform. However, participation was voluntary and depended on each school principal’s willingness to join the evaluation. Schools originally assigned to treatment could therefore opt out. If they did, they neither received access to ShaIA nor participated in the experiment.

Because of the risk of non-participation, we implemented a block-randomization algorithm in which schools were grouped into blocks and treatment was assigned within each block. This ensured that, even if a school assigned to treatment chose not to participate in a given block, the randomization would remain valid across the other blocks of the experiment. The randomization proceeded in three stages:

1. For each eligible school, we created a covariate index defined as the average of several school-level characteristics.

2. Schools were then ranked by the covariate index and grouped into blocks. More similar schools were assigned to the same block.

3. Treatment was randomized within each block, yielding a treatment group of 70 schools and a control group of 88.
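The three stages above can be sketched as follows. The block size, random seed, and within-block treatment count are illustrative assumptions, not the registered values:

```python
import numpy as np

def block_randomize(index: np.ndarray, block_size: int,
                    n_treat_per_block: int, seed: int = 42):
    """Sketch of the three-stage block randomization.

    index: covariate index per school (average of standardized covariates).
    Schools are ranked by the index, grouped into consecutive blocks of
    similar schools, and treatment is drawn within each block.
    Returns (block_id, treated) arrays aligned with the input order.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(index)  # stage 2: rank schools by the covariate index
    n = len(index)
    block_id = np.empty(n, dtype=int)
    treated = np.zeros(n, dtype=bool)
    for b, start in enumerate(range(0, n, block_size)):
        members = order[start:start + block_size]
        block_id[members] = b
        # stage 3: randomize treatment within the block
        draw = rng.choice(members,
                          size=min(n_treat_per_block, len(members)),
                          replace=False)
        treated[draw] = True
    return block_id, treated

# Hypothetical example: 158 schools grouped into blocks of 3 similar schools,
# one treated school drawn per block
rng = np.random.default_rng(0)
cov_index = rng.normal(size=158)  # stage 1: illustrative covariate index
blocks, treat = block_randomize(cov_index, block_size=3, n_treat_per_block=1)
```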


Due to the random nature of the treatment assignment, we can estimate the effect of providing teachers with AI support by comparing outcomes for teachers and students in treated schools with those in control schools. Because treatment was assigned using block randomization, these comparisons must be made within blocks; therefore, we condition on block indicators.

Some mathematics teachers in treated schools may choose not to participate in the intervention. As a result, some units assigned to treatment may remain untreated, implying that our strategy identifies an Intention-to-Treat (ITT) effect. To account for imperfect compliance, we additionally define an indicator variable equal to one if a teacher adopts the technology. We then instrument treatment adoption with treatment assignment to estimate the Treatment-on-the-Treated (TOT) effect.
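Ignoring the block fixed effects for brevity, the ITT and TOT estimates can be sketched with a simple Wald (2SLS) calculation using assignment as the instrument for adoption; all data below are simulated for illustration:

```python
import numpy as np

def itt_and_tot(y, assigned, adopted):
    """Wald-style sketch of ITT and TOT with a binary instrument.

    y: outcome (e.g., standardized test score)
    assigned: 1 if the unit was assigned to treatment (the instrument)
    adopted: 1 if the teacher actually adopted the technology
    (Block fixed effects, used in the actual analysis, are omitted here.)
    """
    y, z, d = map(np.asarray, (y, assigned, adopted))
    itt = y[z == 1].mean() - y[z == 0].mean()          # reduced form
    first_stage = d[z == 1].mean() - d[z == 0].mean()  # compliance rate
    tot = itt / first_stage                            # Wald / 2SLS estimate
    return itt, tot

# Hypothetical data: 60% compliance, true adoption effect of 0.2 SD
rng = np.random.default_rng(1)
z = rng.integers(0, 2, size=5000)
d = z * (rng.random(5000) < 0.6)
y = 0.2 * d + rng.normal(size=5000)
itt, tot = itt_and_tot(y, z, d)
```

With imperfect compliance the TOT estimate exceeds the ITT estimate, since the reduced-form difference is scaled up by the compliance rate.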

Internal Validity: The internal validity of the experiment relies on treatment being assigned orthogonally to baseline characteristics. To assess this, we will conduct balance tests using school-level administrative data and baseline survey data.

Heterogeneity: Treatment effects of educational interventions are known to be highly heterogeneous. This makes it essential to study effect heterogeneity when evaluating the impact of ShaIA. We approach this in two ways.

First, the context of Bogotá suggests substantial pre-existing gaps in math achievement by gender and migrant status. Prior evidence shows that girls in Colombia score significantly lower in math, a gap largely explained by within-school factors. Likewise, migrant students tend to learn less in non-adapted environments, and lower learning rates for migrant populations have been documented in Colombia as well. These patterns provide theoretical grounds to expect larger treatment effects for these groups, given that ShaIA aims specifically to address within-classroom heterogeneity. Therefore, we include these two heterogeneity dimensions in our pre-analysis plan as pre-specified heterogeneous effects that test our original hypothesis. In addition, given the nature of the treatment, we allow for the possibility that treatment effects may vary across teacher characteristics and levels of treatment exposure. Accordingly, we will examine treatment effect heterogeneity by teacher type—specifically age, baseline AI knowledge, and gender—as well as by treatment exposure (time exposed to ShaIA and timing of the workshops).

Second, we will formally test the null hypothesis that the standard deviation of treatment effects is equal to zero. Rejecting this null would indicate heterogeneous impacts, in which case we will estimate Conditional Average Treatment Effects (CATEs) based on pre-specified characteristics. This data-driven analysis will help identify the subgroups that benefit the most from the intervention. Our hypothesis is that girls and migrant students are likely to experience the largest gains, as the intervention is designed to reduce within-class learning gaps.
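A minimal sketch of the pre-specified subgroup analysis, estimating separate treatment effects by a binary student characteristic (e.g., gender) through an interaction term; the data and effect sizes below are simulated for illustration:

```python
import numpy as np

def subgroup_effects(y, treat, group):
    """Pre-specified heterogeneity sketch via an interaction term.

    Estimates y = a + b*treat + c*group + d*(treat*group) by OLS;
    b is the effect for group == 0 and b + d the effect for group == 1.
    """
    X = np.column_stack([np.ones_like(y), treat, group, treat * group])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1], coef[1] + coef[3]

# Hypothetical data: 0.1 SD effect for boys, 0.3 SD for girls
rng = np.random.default_rng(2)
n = 8000
t = rng.integers(0, 2, n).astype(float)   # treatment indicator
g = rng.integers(0, 2, n).astype(float)   # subgroup indicator (e.g., girl)
y = 0.1 * t + 0.2 * t * g + rng.normal(size=n)
eff_g0, eff_g1 = subgroup_effects(y, t, g)
```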
Experimental Design Details
Not available
Randomization Method
Because of the risk of non-participation, we implemented a block-randomization algorithm in which schools were grouped into blocks and treatment was assigned within each block. This ensured that, even if a school assigned to treatment chose not to participate in a given block, the randomization would remain valid across the other blocks of the experiment. The randomization proceeded in three stages:

1. For each eligible school, we created a covariate index defined as the average of several school-level characteristics. (All covariates were standardized and then averaged to build a one-dimensional index.)

2. Schools were then ranked by the covariate index and grouped into 52 blocks. More similar schools were assigned to the same block.

3. Treatment was randomized within each block, yielding a treatment group of 70 schools and a control group of 88.
Randomization Unit
Randomization was clustered at the school level, with 70 schools treated and 88 in the control group.
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
The intervention includes 158 schools, out of which 55 are to be treated and the rest will be part of the control group.
Sample size: planned number of observations
Current funding allows us to sample 6,000 students and 270 teachers within 40 treated and 50 control schools. However, the final sample size depends on whether additional funding becomes available, in which case we will sample a larger number of students.
Sample size (or number of clusters) by treatment arms
55 schools will be assigned to the treatment group. The remaining 103 schools will be assigned to the control group.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
Innovations for Poverty Action
IRB Approval Date
2025-03-06
IRB Approval Number
4889