NUMI, Mastery Learning, and Short-Duration Math Practice

Last registered on April 14, 2026

Pre-Trial

Trial Information

General Information

Title
NUMI, Mastery Learning, and Short-Duration Math Practice
RCT ID
AEARCTR-0018315
Initial registration date
April 09, 2026

First published
April 14, 2026, 9:01 AM EDT

Locations

Primary Investigator

Affiliation
University of Toronto

Other Primary Investigator(s)

PI Affiliation
University of Toronto
PI Affiliation
University of Pennsylvania

Additional Trial Information

Status
Completed
Start date
2026-03-30
End date
2026-04-03
Secondary IDs
None
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
We are studying whether short classroom math practice activities can be improved by combining mastery learning with an AI tutor. Middle school students are randomly assigned to different versions of an online math practice activity during class. Some students receive access to an AI tutor, while others do not. Some students complete the activity under mastery-learning rules, meaning they must watch enough of the lesson video and answer questions correctly before moving on, while others can move through the activity more freely. Students are also randomly assigned which math topic they practice.

About one week later, students take a test that includes questions on both the topic they practiced and a topic they did not practice. This allows us to compare how much students learned from the activity and whether the effects differ depending on AI access and mastery learning. We also study how far students progress through the activity, how they interact with the AI tutor, and whether technical issues such as website speed may affect results.
External Link(s)

Registration Citation

Citation
Liut, Michael, Philip Oreopoulos and Alp Sangu. 2026. "NUMI, Mastery Learning, and Short-Duration Math Practice." AEA RCT Registry. April 14. https://doi.org/10.1257/rct.18315-1.0
Experimental Details

Interventions

Intervention(s)
Students participated in an online classroom math practice activity during a regular class period. They were randomly assigned to receive access to an AI math tutor or not, to complete the activity either under mastery-learning rules or under a more flexible non-mastery format, and to practice one of two math topics. In the mastery condition, students had to watch at least part of the instructional video and answer questions correctly before moving on. In the non-mastery condition, students could move through the material more freely. Approximately one week later, students completed a follow-up test.
Intervention (Hidden)
Students in grades 6, 7, and 8 completed an online math practice activity during a 50-minute class period. Students were randomized along three dimensions: (i) AI tutor access, (ii) mastery-learning rules, and (iii) topic assignment. Each student was assigned one of two topic bundles, each consisting of two sets of “one video, one exercise,” followed by an exit ticket.

In the mastery condition, students had to watch at least one minute of the video and answer three questions in a row correctly before moving to the next set, and again before reaching the exit ticket. In the non-mastery condition, students had to attempt at least one question but could otherwise move through videos, exercises, and the exit ticket more freely.

Students assigned to AI had access to NUMI, an AI math tutor. NUMI provided several forms of support, including “help get me started” hints, structured step-by-step walkthroughs after mistakes, optional explanations of selected solution steps, and limited open-ended interaction. Students not assigned to AI could view solutions after mistakes but did not receive tutoring support.

Approximately one week later, students completed a test with one question corresponding to each exercise type for both the practiced and unpracticed topic bundles. This design allows estimation of the effect of practice by comparing each student’s performance on the topic they practiced to their performance on the topic they did not practice.
Intervention Start Date
2026-03-30
Intervention End Date
2026-04-03

Primary Outcomes

Primary Outcomes (end points)
The primary outcome is student performance on the week-later test, measured as the within-student difference between performance on questions corresponding to the topic assigned for practice and performance on questions corresponding to the topic not assigned for practice.
Primary Outcomes (explanation)
Each student's follow-up test contains, for both the practiced and the unpracticed topic bundle, one question corresponding to exercise 1 and one question corresponding to exercise 2. The main outcome is constructed as:

(score on practiced-topic questions) minus (score on unpracticed-topic questions).

We will also examine this separately for the question corresponding to exercise 1 and the question corresponding to exercise 2.
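
For concreteness, the outcome could be assembled from question-level scores along the following lines (a minimal sketch; the data layout and column names such as practiced_e1 are hypothetical, not the platform's actual schema):

```python
import pandas as pd

# Hypothetical follow-up test data: one row per student, each question
# scored 0/1. Column names are illustrative, not the platform's schema.
tests = pd.DataFrame({
    "student_id":     [1, 2, 3],
    "practiced_e1":   [1, 0, 1],   # practiced topic, exercise-1 question
    "practiced_e2":   [1, 1, 0],   # practiced topic, exercise-2 question
    "unpracticed_e1": [0, 0, 1],   # unpracticed topic, exercise-1 question
    "unpracticed_e2": [1, 0, 0],   # unpracticed topic, exercise-2 question
})

# Primary outcome: within-student practiced-minus-unpracticed score.
tests["practice_effect"] = (
    (tests["practiced_e1"] + tests["practiced_e2"])
    - (tests["unpracticed_e1"] + tests["unpracticed_e2"])
)

# Exercise-specific versions, examined separately as noted above.
tests["practice_effect_e1"] = tests["practiced_e1"] - tests["unpracticed_e1"]
tests["practice_effect_e2"] = tests["practiced_e2"] - tests["unpracticed_e2"]
```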

Secondary Outcomes

Secondary Outcomes (end points)
Secondary outcomes include:

- total follow-up test score
- score on the practiced-topic questions
- score on the unpracticed-topic questions
- exit ticket score
- progression through the practice activity, including whether students reached the second set and the exit ticket
- mastery-specific progression outcomes, including whether students cleared the first and second mastery gates
- AI usage and engagement measures, including whether students used the AI tutor, number of interactions, and time spent interacting with the tutor
Secondary Outcomes (explanation)
The exit ticket score is based on two questions completed during the practice session, one corresponding to the first exercise and one corresponding to the second. Progression outcomes are constructed from platform logs and include whether the student reached or completed major steps in the activity. Mastery-gate outcomes are based on whether a student satisfied the platform’s progression requirements for each set. AI usage outcomes are constructed from interaction logs, including counts and timing of tutor use.
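
As an illustration, the per-student AI engagement measures might be aggregated from raw interaction logs roughly as follows (a sketch under an assumed log schema; the actual NUMI logs may be structured differently):

```python
import pandas as pd

# Hypothetical interaction log: one row per exchange with the tutor.
logs = pd.DataFrame({
    "student_id": [1, 1, 2],
    "timestamp": pd.to_datetime(
        ["2026-03-30 10:01", "2026-03-30 10:04", "2026-03-30 10:02"]),
    "duration_s": [40, 75, 30],   # seconds spent on the exchange
})

# Per-student engagement: interaction count and total time with the tutor.
usage = logs.groupby("student_id").agg(
    n_interactions=("timestamp", "size"),
    time_spent_s=("duration_s", "sum"),
).reset_index()
usage["used_ai"] = usage["n_interactions"] > 0
print(usage)
```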

Experimental Design

Experimental Design
This study uses randomized assignment of middle school students to different versions of an in-class online math practice activity. Students are randomly assigned to: (i) access to an AI tutor or not, (ii) mastery-learning rules or not, and (iii) one of two math topics to practice. About one week later, students complete a follow-up test covering both the topic they practiced and the topic they did not practice. The main analysis compares, within student, performance on practiced versus unpracticed material across treatment groups.
Experimental Design Details
The design is a 2 x 2 factorial experiment crossing AI tutor access with mastery learning, with an additional random assignment of practice topic. The four main treatment groups are: Mastery + AI, Non-mastery + AI, Mastery + no AI, and Non-mastery + no AI. Topic assignment is randomized independently. The primary estimand is the within-student practice effect, defined as performance on follow-up test questions corresponding to the practiced topic minus performance on follow-up test questions corresponding to the unpracticed topic. Secondary analyses examine progression through the activity, exit ticket performance, mastery-gate completion, AI engagement, and heterogeneity by grade and topic. Exploratory robustness analyses will examine whether website slowness and low-effort test-taking affect results.
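
Under assumed variable names, the factorial comparison can be read as a saturated regression of the within-student practice effect on the two treatment indicators and their interaction. The sketch below, with simulated placeholder data, is one way to write it, not the registered estimation code:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 8000

# Simulated placeholder data; `practice_effect` stands in for the
# within-student practiced-minus-unpracticed score.
df = pd.DataFrame({
    "ai":      rng.integers(0, 2, n),   # 1 = AI tutor access
    "mastery": rng.integers(0, 2, n),   # 1 = mastery-learning rules
})
df["practice_effect"] = 0.10 + 0.05 * df["ai"] + rng.normal(0, 1, n)

# Saturated 2 x 2 model: AI and mastery main effects plus interaction.
fit = smf.ols("practice_effect ~ ai * mastery", data=df).fit(cov_type="HC1")
print(fit.params)
```
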
Randomization Method
Randomization was done by computer through the online platform at the student level.
Randomization Unit
Individual student. Topic assignment, AI assignment, and mastery assignment were all randomized at the individual student level.
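
Because all three dimensions are independent student-level draws, the assignment mechanism amounts to three coin flips per student, as in this illustrative sketch (the platform's actual implementation is not described in the registration):

```python
import numpy as np

rng = np.random.default_rng(seed=2026)
n_students = 8000

# Three independent Bernoulli(0.5) draws per student.
ai_access = rng.integers(0, 2, n_students)   # 1 = AI tutor available
mastery   = rng.integers(0, 2, n_students)   # 1 = mastery-learning rules
topic     = rng.integers(0, 2, n_students)   # 0/1 = which topic bundle
```
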
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
Not applicable; randomization is at the individual student level.
Sample size: planned number of observations
Approximately 8,000 students in grades 6–8
Sample size (or number of clusters) by treatment arms
Approximately a 50/50 split on each randomized dimension (AI access, mastery rules, topic), implying roughly 2,000 students in each of the four main treatment arms.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Assuming approximately 8,000 students, individual-level randomization, 80 percent power, and a two-sided 5 percent significance level, the study is powered to detect effects of about 0.06 standard deviations for a main 50/50 treatment comparison on standardized test-score outcomes. For pairwise comparisons across the four main treatment arms, the corresponding minimum detectable effect is about 0.09 standard deviations.
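
These figures are consistent with the standard two-sample formula for the minimum detectable effect on a unit-variance outcome; a quick back-of-the-envelope check, assuming equal splits:

```python
from scipy.stats import norm

def mde(n1, n2, alpha=0.05, power=0.80):
    """Minimum detectable effect, in SD units, for comparing the means
    of two groups on an outcome with unit variance."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z * (1 / n1 + 1 / n2) ** 0.5

print(round(mde(4000, 4000), 3))  # ~0.063: main 50/50 split, n = 8,000
print(round(mde(2000, 2000), 3))  # ~0.089: pairwise, two arms of ~2,000
```
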
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number
Analysis Plan

Analysis Plan Documents

full preanalysis plan

MD5: e950d863ce9841d6ee630e095f9a5fb6

SHA1: ababd29e615cce1d928975fa065bc3d991ee17d9

Uploaded At: April 09, 2026

Post-Trial

Post Trial Information

Study Withdrawal

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials