RE-ALTER: Randomized Experiment with AI-augmented Learning by Teaching to Enhance and Renovate Math Learning

Last registered on August 28, 2024

Pre-Trial

Trial Information

General Information

Title
RE-ALTER: Randomized Experiment with AI-augmented Learning by Teaching to Enhance and Renovate Math Learning
RCT ID
AEARCTR-0014202
Initial registration date
August 18, 2024


First published
August 28, 2024, 2:32 PM EDT


Locations


Primary Investigator

Affiliation
University of Florida

Other Primary Investigator(s)

Additional Trial Information

Status
In development
Start date
2025-01-13
End date
2025-05-15
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
ALTER-Math seeks to enhance middle school math learning by leveraging advanced AI technologies, specifically large language models (LLMs) and AI-enhanced teachable agents. Our aim is to double the rate of middle school math learning, particularly for students from low-income and minority households, by transforming passive learners into proactive teachers of AI. To validate our approach, we will conduct a 3-month two-level cluster randomized trial (RE-ALTER) involving at least 32 teachers and more than 1,600 students. RE-ALTER will use Florida's FAST assessment to measure mathematics performance and student log data to gauge engagement. Based on our theoretical grounding, which is associated with notable effect sizes for STEM learning, and on our prior results, we expect the intervention to produce significant improvements in math knowledge on both proximal and distal learning outcomes, particularly for underserved students. Specifically, we seek to answer the research question “How effectively does the AI-powered teachable agent engage students and improve students’ mathematical achievement?”
External Link(s)

Registration Citation

Citation
Xing, Wanli. 2024. "RE-ALTER: Randomized Experiment with AI-augmented Learning by Teaching to Enhance and Renovate Math Learning." AEA RCT Registry. August 28. https://doi.org/10.1257/rct.14202-1.0
Experimental Details

Interventions

Intervention(s)
Intervention Start Date
2025-01-13
Intervention End Date
2025-05-15

Primary Outcomes

Primary Outcomes (end points)
The Florida Assessment of Student Thinking (FAST), or a similar standardized assessment, will serve as the distal measure of the mathematics performance of middle school students. FAST is the progress monitoring test adopted by the Florida Department of Education (FLDOE). Administered three times throughout the academic year (August, January, and April), the FAST is specifically structured to gauge student growth over time. Aligned with Florida’s BEST standards, this assessment offers a comprehensive evaluation of students' mathematical proficiency. The FAST tests for grades 6 to 8 are computer-adaptive tests lasting 100 minutes for waves 1 and 2 and 120 minutes for wave 3, ensuring a thorough examination of students' skills and knowledge across various levels of complexity. FAST contains multiple item formats such as multiple choice, multiselect, edit task choice, matching, and equation editing (Florida Department of Education, 2024). We are in active conversation with Florida school districts to establish a data sharing agreement for students’ FAST data, and we expect to obtain access before Spring 2025.
Primary Outcomes (explanation)
Students’ interactive log data, aggregated by week, will be used to measure the growth in student mathematics engagement across the Spring semester. Our team has previously evaluated students’ interactive log information as a proxy for engagement, such as the total number of videos viewed (Kim et al., 2020). In this study, we will use information extracted from the log data, such as students’ time-on-task, retention, frequency of being active, and instructions to AI agents, as a proxy for students' engagement.
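
The weekly aggregation described above could look roughly like the following sketch. The event schema (student ID, date, minutes on task) and the function name are assumptions made for illustration, not the Math Nation log format; only two of the listed proxies (time-on-task and frequency of being active) are shown, and week 0 is taken to start on the registered intervention start date.

```python
from collections import defaultdict
from datetime import date


def weekly_engagement(events, start):
    """Aggregate raw log events into per-student, per-week summaries.

    events: iterable of (student_id, event_date, minutes_on_task) tuples.
            This schema is hypothetical, not the actual log format.
    start:  first day of the intervention; week 0 begins on this date.
    Returns {(student_id, week): {"minutes": float, "active_days": int}}.
    """
    minutes = defaultdict(float)
    days = defaultdict(set)
    for student_id, event_date, mins in events:
        week = (event_date - start).days // 7
        if week < 0:
            continue  # skip events logged before the intervention started
        minutes[(student_id, week)] += mins
        days[(student_id, week)].add(event_date)
    return {
        key: {"minutes": minutes[key], "active_days": len(days[key])}
        for key in minutes
    }


# Made-up events, for illustration only.
start = date(2025, 1, 13)
events = [
    ("s1", date(2025, 1, 13), 20.0),
    ("s1", date(2025, 1, 15), 15.0),
    ("s1", date(2025, 1, 21), 30.0),
    ("s2", date(2025, 1, 14), 10.0),
]
summary = weekly_engagement(events, start)
```

Each `(student, week)` entry can then feed an engagement score for the weekly growth analysis.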

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
RE-ALTER will be conducted to collect evidence of the effect of student use of ALTER-Math on math learning and engagement. Leveraging Math Nation, we can selectively deploy updates and randomized experiments to ALTER-Math. The pilot study is a two-level cluster randomized trial (CRT) with students clustered within teachers from Florida middle schools. Random assignment will occur at the teacher level with a total of 32 teachers (16 in each condition), an average of 2 classrooms per teacher, and 25 students per classroom. We propose to randomize at this level to avoid the spillover effects that are likely to occur with random assignment at the student level. The control group is a business-as-usual condition, using Math Nation at the same frequency as the experimental group but without ALTER-Math support. This design meets the standards of the What Works Clearinghouse (WWC) version 5.0 without reservations (U.S. Department of Education et al., 2022). We will test baseline equivalence using balance tests but do not expect imbalances, given the large, random sample.
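
One common way to express a baseline balance check is a standardized mean difference on the pretest. The sketch below computes Hedges' g with the small-sample correction and sorts it into the WWC baseline-equivalence bands (at most 0.05, between 0.05 and 0.25, above 0.25). It is an illustration under simplifying assumptions: the function names and scores are made up, and clustering of students within teachers is ignored.

```python
from statistics import mean, variance


def hedges_g(treatment, control):
    """Standardized mean difference with the small-sample correction."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    d = (mean(treatment) - mean(control)) / pooled_var ** 0.5
    omega = 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction factor
    return omega * d


def baseline_category(g):
    """Classify |g| using the WWC baseline-equivalence bands."""
    size = abs(g)
    if size <= 0.05:
        return "equivalent"
    if size <= 0.25:
        return "requires statistical adjustment"
    return "not equivalent"


# Made-up pretest scores, for illustration only.
g_same = hedges_g([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
g_shifted = hedges_g([2, 3, 4, 5, 6], [1, 2, 3, 4, 5])
```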

Methods: We will start recruiting teachers in early Fall through partnered school districts that are committed to using Math Nation (see Appendix A). In late Fall 2024, all participating teachers will complete a synchronous online PD on ALTER-Math ($250 incentive for each teacher participant). Teachers will complete a knowledge measure after the training and must answer 80% of the items correctly to proceed in the study. Random assignment to the control and experimental groups will take place in early Spring 2025, following the PD session, to officially launch the study. Teachers assigned to the experimental group will commit to using ALTER-Math with their students at least one hour per week in early Spring 2025. Regardless of intervention assignment, each teacher will receive $1,000 as compensation. The fidelity of implementation will be checked through system logs, and we will communicate regularly with treatment teachers to ensure a high level of fidelity. The intervention will last three months, or approximately 12 weeks. Previous studies of AI-based interventions in middle school have shown that regular weekly use during the Spring semester leads to improved student achievement on a high-stakes test administered at the end of the semester (Leite et al., 2022). All participating teachers will have full access to ALTER-Math for the duration of the LEVI project after the experimental study.

Inclusion of underserved students: To ensure the participation of students from economically disadvantaged and racial minority backgrounds, we focus on partnering with districts that (1) have a substantial student body, to maximize our influence, and (2) serve a large proportion of economically disadvantaged students (using free or reduced-price meal status as a proxy) and minority students (as recorded in the Math Nation database). To this end, we have identified an initial list of K-12 school districts that have committed to a four-year use of Math Nation. We have access to all the data generated in Math Nation by users from these school districts, including demographics such as race, gender, and school, behavioral logs, and in-platform assessments.

Data Analysis Plan: We will assess the effect of ALTER-Math on student interest and on proximal and distal achievement scores using a two-level ANCOVA model:
$$Y_{jkt} = \gamma_0 + \gamma_1 Z_k + \gamma_2 (X_{jkt} - M_k) + \gamma_3 (M_k - M) + u_k + \varepsilon_{jkt}$$
where $Y_{jkt}$ is the standardized posttest score on an outcome measure for student $j$ of teacher $k$ in treatment $t$. $Z_k$ is an effect-coded variable indicating the condition to which teacher $k$ was assigned, $X_{jkt}$ is the pretest score, $M_k$ is the teacher-level mean pretest score, $M$ is the grand mean pretest score, and $u_k$ and $\varepsilon_{jkt}$ are teacher-level and student-level residuals, respectively. For the FAST, the posttest score is the third assessment, given at the end of the school year (April to May 2025), and the pretest score is the second assessment, administered mid-year (January 2025). We will explore the inclusion of district fixed effects during data analysis. Between-teacher and within-teacher covariate-by-treatment interactions will also be investigated. Standardized mean difference effect sizes will be reported for each analysis.
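
The two pretest terms in the ANCOVA model split each pretest score into a within-teacher part (student score minus teacher mean) and a between-teacher part (teacher mean minus grand mean). A minimal sketch of that preparation step follows. The record layout is hypothetical, the grand mean is assumed to be the mean over all students, and effect coding is assumed to be +1 for treatment and -1 for control; under that coding, the treatment coefficient is half the adjusted treatment-control difference.

```python
from collections import defaultdict
from statistics import mean


def center_pretest(records):
    """Build the ANCOVA predictors from raw student records.

    records: list of dicts with keys "teacher", "pretest", "treated"
             (a hypothetical layout, for illustration only).
    Returns one dict per student with:
      Z       effect code (+1 treatment, -1 control; an assumption)
      within  X_jkt - M_k  (student deviation from the teacher mean)
      between M_k - M      (teacher-mean deviation from the grand mean)
    """
    by_teacher = defaultdict(list)
    for r in records:
        by_teacher[r["teacher"]].append(r["pretest"])
    teacher_mean = {k: mean(v) for k, v in by_teacher.items()}
    # Assumption: the grand mean is taken over all students.
    grand_mean = mean(r["pretest"] for r in records)
    return [
        {
            "Z": 1 if r["treated"] else -1,
            "within": r["pretest"] - teacher_mean[r["teacher"]],
            "between": teacher_mean[r["teacher"]] - grand_mean,
        }
        for r in records
    ]


# Made-up records, for illustration only.
records = [
    {"teacher": "A", "pretest": 10, "treated": True},
    {"teacher": "A", "pretest": 20, "treated": True},
    {"teacher": "B", "pretest": 30, "treated": False},
    {"teacher": "B", "pretest": 40, "treated": False},
]
rows = center_pretest(records)
```

The resulting columns would then be passed to mixed-effects software with a random intercept for teacher.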

To assess the effect of ALTER-Math on weekly student engagement across the 12 weeks of the Spring semester, we will use the following growth curve model:
$$\Theta_{ijk} = \delta_0 + \delta_1 T_{ijk} + \delta_2 Z_k + r_{0jk} + r_{1jk} T_{ijk} + u_k + \varepsilon_{ijk}$$
where $\Theta_{ijk}$ is the engagement estimate at week $i$ (from 1 to 12) for student $j$ of teacher $k$, $T_{ijk}$ is the week indicator, and $Z_k$ is the treatment condition of teacher $k$. $\delta_0$ is the average intercept and $r_{0jk}$ is a student random intercept, which quantifies the student's initial engagement status. $\delta_1$ is the mean growth per week and $r_{1jk}$ is the random slope of week, quantifying the individual linear rate of change in engagement over the Spring semester. $u_k$ is the teacher-level random effect and $\varepsilon_{ijk}$ is a residual. The week indicator will be coded from 0 to 11 so that the intercept can be interpreted as the initial level of engagement. The parameters of interest in this model are the mean of the individual intercepts (i.e., the average initial status); the variance of the intercept; the mean of the slope of week (i.e., the average change in engagement per week); the variance of the slope; and the correlation between intercept and slope.
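
To make the week coding concrete, the sketch below fits a separate least-squares line to one student's twelve weekly engagement values, with week coded 0 to 11 so that the intercept is the week-one level. This is a descriptive, per-student approximation and not the growth curve model itself, which estimates the random intercepts and slopes jointly in mixed-effects software. The function name and data are made up for illustration.

```python
def student_trajectory(values):
    """Least-squares intercept and slope for one student's weekly series.

    values: engagement estimates for consecutive weeks, in order.
    Weeks are coded 0, 1, ..., len(values) - 1, so the intercept is the
    fitted engagement in the first week.
    """
    weeks = list(range(len(values)))
    n = len(values)
    mean_w = sum(weeks) / n
    mean_v = sum(values) / n
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    sxy = sum((w - mean_w) * (v - mean_v) for w, v in zip(weeks, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_w
    return intercept, slope


# Made-up series: starts at 10 and grows by 2 per week over 12 weeks.
series = [10 + 2 * week for week in range(12)]
intercept, slope = student_trajectory(series)
```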
Experimental Design Details
Not available
Randomization Method
Simple randomization (coin flip)
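
A seeded, reproducible version of the assignment could be sketched as follows. The teacher IDs, seed, and function names are hypothetical. Independent coin flips do not guarantee equal arms, so a shuffle-based variant that fixes the split at half and half is shown alongside it for comparison.

```python
import random


def coin_flip_assignment(teacher_ids, seed):
    """Assign each teacher independently with probability 0.5."""
    rng = random.Random(seed)
    return {
        t: "treatment" if rng.random() < 0.5 else "control"
        for t in teacher_ids
    }


def balanced_assignment(teacher_ids, seed):
    """Shuffle the teachers and assign the first half to treatment."""
    rng = random.Random(seed)
    shuffled = list(teacher_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        t: "treatment" if i < half else "control"
        for i, t in enumerate(shuffled)
    }


# Hypothetical teacher IDs and seed, for illustration only.
teachers = [f"T{i:02d}" for i in range(1, 33)]
flips = coin_flip_assignment(teachers, seed=14202)
balanced = balanced_assignment(teachers, seed=14202)
```

Recording the seed alongside the assignment list allows the allocation to be audited later.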
Randomization Unit
The study is a two-level cluster randomized trial (CRT) with students clustered within teachers from Florida middle schools.
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
~32 teachers
Sample size: planned number of observations
~1,600 students
Sample size (or number of clusters) by treatment arms
16 teachers control, 16 teachers treatment
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
The minimum detectable effect size (MDES) with a sample of 32 teachers and a power level of 0.8 at α = 0.05 was estimated with the PowerUp software (Dong & Maynard, 2013). The design is a two-level cluster randomized trial with students within teachers, with treatment at level two. We estimated that each teacher has an average of 2 classes of 25 students each. An estimate of the proportion of variance among teacher-level performance (ICC = 0.2) was assumed. The proportion of variance explained at each level by the pretest (i.e., performance on the mathematics test of the previous year), R² = 0.80, was estimated using high-stakes mathematics achievement data from a large district in Florida (Leite et al., 2022). Estimates of effect size (0.25) were obtained from our previous 5-week quasi-experimental study. In that study, a significant increase in mathematical knowledge was observed in the treatment group, t(826) = 7.80, p < .001, d = 0.315, compared with the control group, t(773) = 5.28, p < .001, d = 0.202. We calculated the MDES with a power of 0.8 as a function of the total number of teachers, indicating that 32 teachers allow the detection of an effect size of 0.27 assuming a 90% retention rate for both teachers and students, a slightly conservative effect size based on our prior quasi-experimental study.
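
For reference, the MDES for a two-level cluster randomized design with treatment at level two is usually written in the form below. This is the general expression rather than a reproduction of the registered calculation; the symbols $J$, $n$, $P$, and $g^*$ are introduced here for illustration, and the exact inputs used in PowerUp (for example, how the 90% retention rate was applied) are not stated in the registration.

```latex
\mathrm{MDES} = M_{J-g^*-2}
  \sqrt{\frac{\rho\,(1 - R_2^2)}{P(1-P)\,J}
      + \frac{(1-\rho)\,(1 - R_1^2)}{P(1-P)\,J\,n}}
% J      : number of teachers (clusters)
% n      : average number of students per teacher
% \rho   : intraclass correlation (ICC) at the teacher level
% P      : proportion of teachers assigned to treatment
% R_1^2  : variance explained by covariates at the student level
% R_2^2  : variance explained by covariates at the teacher level
% g^*    : number of teacher-level covariates
% M_{J-g^*-2} = t_{\alpha/2} + t_{1-\beta}, with J - g^* - 2 degrees of freedom
```

With the registered assumptions, the inputs would be $\rho = 0.2$, $R_1^2 = R_2^2 = 0.80$, $P = 0.5$, and about 50 students for each of 32 teachers before attrition.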
IRB

Institutional Review Boards (IRBs)

IRB Name
ALTER-Math: AI-augmented Learning by Teaching to Enhance and Renovate Math Learning
IRB Approval Date
2024-08-07
IRB Approval Number
IRB202301838