Impact Evaluation of the Darsel Math Personalized Learning Platform in Jordan

Last registered on November 25, 2025

Pre-Trial

Trial Information

General Information

Title
Impact Evaluation of the Darsel Math Personalized Learning Platform in Jordan
RCT ID
AEARCTR-0017307
Initial registration date
November 21, 2025

First published
November 25, 2025, 8:03 AM EST

Locations


Primary Investigator

Affiliation
University of Minnesota

Other Primary Investigator(s)

PI Affiliation
University of Minnesota
PI Affiliation
Institute for International Economic Studies, Stockholm University
PI Affiliation
Darsel
PI Affiliation
University of Minnesota
PI Affiliation
Stockholm University & Institute for International Economic Studies
PI Affiliation
Department of Economics, Stockholm University

Additional Trial Information

Status
Ongoing
Start date
2025-09-14
End date
2027-08-01
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Education contributes to economic growth and increases individuals’ incomes and overall quality of life. Most developing countries have succeeded in enrolling almost all children in primary school, yet student learning in primary school is often far below the levels envisioned for each grade. Jordan faces this challenge as well: student performance in Jordan’s primary schools remains well below international benchmarks. One strategy to address low student performance in developing countries is educational technology (“EdTech”), which can take many forms. One form that has attracted interest is AI chatbots, which can personalize learning to teach at the right level (TaRL). We will implement a randomized controlled trial to assess whether a personalized math chatbot, Darsel, increases the mathematics and socio-emotional skills of grade 6 students in Jordan.
External Link(s)

Registration Citation

Citation
Bold, Tessa et al. 2025. "Impact Evaluation of the Darsel Math Personalized Learning Platform in Jordan." AEA RCT Registry. November 25. https://doi.org/10.1257/rct.17307-1.0
Sponsors & Partners

Experimental Details

Interventions

Intervention(s)
Darsel is an education technology nonprofit organization which is registered in California as a Public Benefit Corporation and is recognized by the U.S. Internal Revenue Service (IRS) as a 501(c)(3) tax-exempt organization. Its mission is to increase student learning in low-resource settings. It has developed a personalized math learning platform (chatbot) that allows students to practice and learn math using low-cost, low-bandwidth messaging channels (e.g., WhatsApp, Facebook Messenger). The platform is adaptive and remedial; Darsel’s algorithms identify and work to rectify learning gaps.

Darsel has been implemented in public schools nationally in Jordan for grade 7. Students use Darsel from home, using a household device (e.g., cell phone), with no need to distribute new hardware and no disruption to school activities. Darsel is designed to be accessible even to the most vulnerable. As of 2016, 98% of students lived in a household with a mobile phone (authors’ calculations based on 2016 Jordan Labor Market Panel Survey). Math teachers receive usage reports for their students via WhatsApp and can access a web-based dashboard. The teacher’s primary role is to motivate and encourage student usage of the Darsel platform.

Darsel has developed over 500,000 questions, hints, and explanations that are aligned with Jordan’s national curriculum. The learning experience on Darsel revolves around questions and answers, where students receive curriculum-aligned questions and respond with the final answer. When students answer incorrectly, Darsel responds with hints (after the first attempt) and full explanations. Darsel also uses AI to dynamically select content for each student based on their response patterns, with the objective of ensuring that content is always provided “at the right level” and within each student’s zone of proximal development.

More specifically, the algorithm leverages expert-defined skill-related metadata to estimate a student’s mastery probabilities for various skills. The personalization occurs continuously, so the student’s learning path evolves with each question. All content is selected from a large database of content that has previously been reviewed and approved by a math expert. The role of large language models is limited to content development and quality assurance. A/B testing is used to optimize the effectiveness of algorithms and content. The chatbot also offers motivational messages and gamification features to make the learning experience more fun and interactive. For example, students unlock weekly levels based on the number of correct questions and also get celebrated for streaks of correct questions.

Not only are Darsel’s technology and content innovative, but so is its method of delivery and implementation. The chatbot’s simplicity, relying on popular messaging channels (WhatsApp, Facebook Messenger), enables it to be implemented in low-resource settings with common household devices and minimal teacher training. This makes it effective for students who are ‘hard to reach’ in traditional classroom settings, such as girls and refugees. Darsel does not disrupt school operations, nor is it demanding of teachers. Implementation of Darsel does not require teachers to adjust their lesson plans or change their approach to teaching. It only asks (but does not mandate or enforce) that teachers spend a total of five minutes per week reviewing the Darsel report and encouraging students to use the platform. In Jordan, Darsel has institutionalized gamification (e.g., school and district leaderboards) and incentives (award ceremonies) to maximize student engagement, motivation, and confidence.

Darsel’s collaboration with Jordan’s Ministry of Education has been ongoing since 2021, when the first pilot was conducted in two public schools. Darsel was then expanded gradually, reaching national adoption for grade 7 students in over 2,000 government schools in March 2023. For grade 7 in the 2024-2025 school year, 43% of students (53,953 students) used Darsel at least once with proper registration. Darsel estimates that another 26,605 students (22%) used Darsel without proper registration; for grade 6, we will ensure that all students register. Among registered students, nearly half used Darsel for five or more weeks and a quarter used Darsel across both semesters.

In this experiment, we will test the business-as-usual model of Darsel (T1) versus additional teacher support for teaching at the right level (T2). The second treatment arm (T2) includes a set of enhanced interventions designed to influence teacher behavior and improve instructional quality. Students’ and teachers’ access to the Darsel platform will be available during the entire academic year (September 2025 to May 2026) for all schools in the seven districts in the sample that are assigned to either T1 or T2. The reports and activities received by teachers will be based on the treatment to which they were randomly assigned.
Intervention Start Date
2025-10-01
Intervention End Date
2026-06-01

Primary Outcomes

Primary Outcomes (end points)
Mathematics skills test score from a two-parameter Item Response Theory (IRT) model
Primary Outcomes (explanation)
In line with psychometric measures of educational learning, we will transform the assessment responses into a standardized metric of learning using Item Response Theory (IRT), specifically a two-parameter logistic model (Jacob and Rothstein 2016). The underlying data for the IRT will be binary (correct = 1, incorrect or no answer = 0) for the mathematics assessment questions. There will be some common (anchor) items in each of the baseline, midline, and endline assessments, as shown in Table 3 in Appendix B, as well as an emphasis on prerequisite skills at baseline, semester one skills at midline, and semester two skills at endline. Items will also vary in difficulty level. We will pilot all items to identify and remove any that perform poorly (e.g., floor or ceiling effects). To place items from each wave on a common scale, we will estimate the IRT model after endline, pooling items from all three assessments.

We will assess the impact of the treatment on the mathematics IRT score as our primary outcome.
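Under the two-parameter logistic model, each item j has a discrimination parameter a_j and a difficulty parameter b_j, and a student with ability theta answers item j correctly with probability 1 / (1 + exp(-a_j * (theta - b_j))). A minimal sketch of this item response function (function and parameter names are ours, for illustration):

```python
import math

def p_correct(theta, a, b):
    """Two-parameter logistic (2PL) IRT item response function:
    probability that a student with ability theta answers an item
    with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta equal to the item difficulty, the probability is 0.5
# regardless of discrimination; higher ability raises it.
p_at_difficulty = p_correct(0.5, 1.2, 0.5)   # 0.5
p_high_ability = p_correct(2.0, 1.2, 0.5)    # > 0.5
```

Estimation recovers the a and b parameters jointly with student abilities; the anchor items shared across waves are what allow all three assessments to be placed on this single ability scale.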

Secondary Outcomes

Secondary Outcomes (end points)
Socio-emotional index
Student dropout and grade repetition
Teacher quality index
Secondary Outcomes (explanation)
To test whether Darsel’s motivational messages, gamification, and teaching at the right level lead to improvements in (math) confidence, motivation, and self-efficacy, we will measure students’ socio-emotional skills. Specifically, we will use the brief form math self-efficacy scale and the brief form math anxiety scale, along with two items from the liking of math scale. These socio-emotional skill measures were designed and tested for United States 5th - 8th graders (Sinclair et al. 2025). An additional item on fixed versus growth mindset is included (Yeager et al. 2019). All these items will have a response scale of: (1) strongly disagree, (2) disagree, (3) agree, (4) strongly agree. Anxiety and fixed mindset items will be reverse coded for analyses. Furthermore, four utility-value items (oriented towards careers), designed and tested for grades 7 - 9 are included (Fiorella et al. 2021). These have responses of (1) never, (2) rarely, (3) usually, (4) sometimes, (5) always. When students do not select a response, their outcome will be missing. We will present analyses for each category of outcomes: self-efficacy, anxiety, liking, fixed vs. growth mindset, and utility-value of math, as well as an overall socio-emotional index. The categories with multiple items and the overall socio-emotional index will be constructed via factor analysis.

We will measure grade repetition and dropout rates based on the students’ enrollment status and grade level in the following academic year (2026 - 2027) from the Ministry of Education’s EMIS.

Specific outcomes for teachers are adapted from the TIMSS 2023 teacher questionnaire (International Association for the Evaluation of Educational Achievement (IEA) 2022) and will be:
Instructional quality: Sum of: How often do you do the following in teaching grade 6 mathematics?
(4) Every or almost‑every lesson (3) About half of lessons (2) Some lessons (1) Never
● Relate lesson to students’ daily lives.
● Ask students to explain their answers.
● Communicate lesson goals/objectives.
● Set challenging exercises beyond instruction.
● Encourage classroom discussions.
● Link new content to prior knowledge.
● Ask students to choose their own problem‑solving

Teacher capacity: Sum of: How much do you agree or disagree?
(1) Agree a lot (2) Agree a little (3) Disagree a little (4) Disagree a lot
● I believe there are too many students in the classes.
● I believe that too many students lack prerequisite knowledge/skills.
● I believe there is too much material to cover.
● I believe there are too many teaching hours.
● I need more time to prepare for class.
● I need more time to assist individual students.
● I feel too much pressure from parents.
● I find it difficult to keep up with curriculum changes.
● I believe there are too many administrative tasks.

Teaching at the right level (TaRL): Number of yes answers to:
When you notice some of your students are falling behind, what have you done (in the last school year)? (select all that apply)
1 = Group the students in the class according to level
2 = Provide individualized and targeted instruction
3 = Provide individualized homework
4 = Review concepts from earlier grades
5 = Assign extra worksheets or homework assignments
6 = Reach out to the parents/guardians
97 = Other (Specify)

We will also create an overall teacher quality index summing all three of these dimensions.

Experimental Design

Experimental Design
We assess the impact of Darsel’s AI chatbot through an RCT that evaluates two versions of the program for grade 6. The first, “business as usual” intervention (T1) reflects the current model of the Darsel platform as used for grade 7 in Jordan. The second intervention (T2) adds teacher encouragement and pedagogical advice (for teachers) to the platform, allowing us to evaluate the (combined) effectiveness of these features relative to the current model.

We implement two stages of randomization: first at the school level and then at the classroom level (in some schools). We randomized at the school level within each district and school sex. There are seven districts, and schools are either all-female or all-male. In each district, 21 schools of each sex were selected, of which seven are assigned to pure control, seven to T1, and seven to T2. We randomly generated ten possible randomizations. For each, we estimated a multinomial logit of treatment group assignment on the school’s average grade 6 class size and number of grade 6 students (both from the 2025 - 2026 EMIS), and a dummy for whether in 2025 - 2026 at least one teacher taught multiple classrooms (to ensure balance for the classroom-level randomization, described below). We also included district-sex fixed effects to account for the stratification. We then selected the randomization with the lowest overall chi-squared value (the most balanced of the ten).
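A simplified sketch of this re-randomization procedure: draw several candidate stratified assignments, score each on covariate balance, and keep the most balanced. The data are illustrative, and a simple between-arm variance of covariate means stands in for the multinomial-logit chi-squared used in the actual Stata implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 294 schools in 14 district-sex strata of 21 schools
# each, with one balance covariate (grade 6 enrollment). Values are
# illustrative only.
n_strata, per_stratum = 14, 21
strata = np.repeat(np.arange(n_strata), per_stratum)
enrollment = rng.normal(70.0, 15.0, n_strata * per_stratum)

def draw_assignment(rng):
    """Within each stratum of 21 schools, assign 7 to C (0), 7 to T1 (1),
    and 7 to T2 (2)."""
    arms = np.empty(len(strata), dtype=int)
    for s in range(n_strata):
        idx = np.where(strata == s)[0]
        arms[idx] = rng.permutation(np.repeat([0, 1, 2], 7))
    return arms

def balance_stat(arms, x):
    """Simplified balance statistic: variance of the covariate mean across
    the three arms (lower = more balanced). The registration instead uses
    the overall chi-squared from a multinomial logit."""
    means = np.array([x[arms == a].mean() for a in range(3)])
    return means.var()

# Draw ten candidate randomizations and keep the most balanced one.
candidates = [draw_assignment(rng) for _ in range(10)]
stats = [balance_stat(a, enrollment) for a in candidates]
best = candidates[int(np.argmin(stats))]
```

Because assignment is stratified, every candidate is a valid randomization; selecting the most balanced of a small, pre-specified number of draws preserves the validity of randomization-based inference.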

In treatment schools (T1 or T2) with more than one class taught by the same math teacher, we conducted an additional classroom-level randomization. Specifically, we randomly chose one class to be excluded from treatment and serve as a control (for one randomly selected math teacher, if more than one teacher teaches two or more math classes). This allows us to conduct additional analysis relying on classroom-level randomization. We undertook a similar process at the classroom level: ten possible randomizations, a multinomial logit with class size as the covariate, and selection of the randomization with the lowest overall chi-squared value.

As a result, control schools contain only untreated classrooms, while treated schools include treated and untreated classrooms if there is at least one math teacher who teaches more than one math class. At the start of the 2025 - 2026 school year, MoE provided information on the number of classes per teacher in the sampled schools to facilitate implementation of this second level of randomization. In the sample, 255 of the 294 schools have multiple classrooms taught by the same teacher.

To estimate the effect of treatment, we will rely on two types of specifications. In the first, which uses a school-level randomization specification, the control group consists exclusively of classrooms in control schools, where neither Darsel in its current form nor the proposed enhanced model has been introduced. The treatment group consists of treated classrooms in the treatment schools. This specification therefore exploits the school-level randomization by comparing treated schools to control schools and excluding the (randomly) untreated classrooms in the treatment schools.

Grade 6 teachers in these control schools may know of Darsel and may have been exposed to it through the rollout of Darsel aimed primarily at grade 7 students during the previous years. Despite the current presence of Darsel for higher grade students in this environment, grade 6 students are unlikely to have used or been encouraged to use Darsel through their school. Moreover, the publicly-available version of Darsel contains only grade 7 content, which is unlikely to help grade 6 students.

In the second specification, we exploit the randomization at the classroom level within those treatment schools with math teachers teaching multiple classes in grade 6. This approach allows us to include teacher fixed effects in the estimation to purge any teacher-specific factors from the treatment effect (in practice, we include school fixed effects, which are equivalent). This approach will allow us to isolate the impact of students using Darsel from any changes in teacher behavior that do not differ across treatment and control classes in the same school.
Experimental Design Details
Not available
Randomization Method
Randomization performed in the office by computer, using Stata code
Randomization Unit
We implement two stages of randomization. Randomization occurs first at the school level and then at the classroom level (in some schools).
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
294 schools
Sample size: planned number of observations
21,462 grade 6 students
Sample size (or number of clusters) by treatment arms
98 schools for each of the 3 treatment arms (C, T1, T2)
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
For the school level randomization our power is 0.85 to detect a 0.20 standard deviation (SD) effect. Alternatively, our minimum detectable effect size at 0.80 power is 0.19 SDs.
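The registration does not report the intra-cluster correlation (ICC) underlying this calculation. As an illustration only, the standard clustered-design MDE formula (e.g., Duflo, Glennerster, and Kremer 2007), with an assumed ICC of 0.2 and roughly 73 students per school (21,462 / 294), yields an MDE in the same range for 98 schools per arm:

```python
import math

def mde_clustered(k_per_arm, m, icc, z_alpha=1.96, z_power=0.84):
    """Approximate MDE (in SD units of the outcome) for a two-arm
    comparison under cluster-level randomization: k_per_arm clusters per
    arm, m students per cluster, intra-cluster correlation icc. The z
    values correspond to a 5% two-sided test and 80% power."""
    return ((z_alpha + z_power)
            * math.sqrt(2.0 / k_per_arm)
            * math.sqrt(icc + (1.0 - icc) / m))

# 98 schools per arm, ~73 grade 6 students per school, and an
# ILLUSTRATIVE ICC of 0.2 (not taken from the registration).
mde = mde_clustered(k_per_arm=98, m=73, icc=0.2)  # roughly 0.18 SD
```

With these assumed inputs the formula gives an MDE of about 0.18 SD, close to the registered 0.19 SD; the actual calculation may differ in its ICC, covariate adjustment, or attrition assumptions.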
IRB

Institutional Review Boards (IRBs)

IRB Name
University of Minnesota Human Research Protection Program
IRB Approval Date
2025-07-30
IRB Approval Number
STUDY00025886
IRB Name
The American University in Cairo
IRB Approval Date
2025-08-22
IRB Approval Number
#2024-2025-289
Analysis Plan
