How much do we gain from personalization?

Last registered on June 19, 2026

Pre-Trial

Trial Information

General Information

Title
How much do we gain from personalization?
RCT ID
AEARCTR-0012190
Initial registration date
October 16, 2023

First published
October 17, 2023, 1:55 PM EDT

Last updated
June 19, 2026, 12:39 PM EDT

Locations

Region

Primary Investigator

Affiliation
ITAM

Other Primary Investigator(s)

PI Affiliation
Bocconi University
PI Affiliation
IDB
PI Affiliation
Cornell University
PI Affiliation
Cornell University

Additional Trial Information

Status
Completed
Start date
2023-10-09
End date
2023-12-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This study asks whether machine learning can improve the targeting and personalization of low-cost digital "nudges" — push-notification messages sent to parents to encourage their children to use an educational app. The setting is Conecta Ideas, a free smartphone-based mathematics platform used by primary-school students across Peru. Sustaining voluntary use of such platforms is a central challenge, and short motivational messages are a cheap, scalable lever. We study two distinct decisions a platform faces: (1) personalization — WHICH message to send to each person, choosing among several alternative messages the one predicted to work best for that individual; and (2) targeting — WHETHER to message a given person at all, given a single common message, concentrating effort on those predicted to respond most.

The study works with a population of roughly 100,000 parents who had used the app at least once in the prior year. Each parent receives push notifications drawn from four behavioral message types — a teacher recommendation (social norms), peer/usage norms, the parent's role in their child's learning (identity), and the future opportunities that learning math creates (present bias). Treated parents receive two notifications per week (Mondays and Thursdays) over three weeks; the control group receives none. The primary outcome is whether the student logs into the platform in the weeks following the messaging; secondary outcomes are time spent on the platform and the number of exercises completed.

The design has two phases. In Phase 1, parents are randomly assigned (at the individual level, stratified by whether the student logged in the previous week) across the candidate messages and a no-message control. Using the Phase 1 outcome data, we estimate both the average effect of each message and how those effects vary across individuals, using two machine-learning estimators — a causal forest and a k-nearest-neighbor matcher — fed a common set of pre-treatment characteristics (grade, location, prior platform use, baseline achievement, and school characteristics). In Phase 2, the same population is re-randomized into four arms of roughly equal size: a uniform "best" arm (everyone receives the single highest-average-effect message from Phase 1); two "personalized" arms (each person receives the message that the causal forest, or the nearest-neighbor model, predicts is best for them); and a "random" arm that serves as a no-personalization benchmark.

Phase 2 is therefore an out-of-sample experimental test of the personalized and targeted assignment rules learned in Phase 1: the rules are fixed using one experiment's data and then evaluated on a fresh draw of the same population. The study is run twice in contrasting engagement environments. The first implementation takes place during the school year (October–December 2023), when baseline weekly login rates are high. A second implementation (added as an addendum; see below) replicates the same two-phase design during the summer vacation (January–February 2024), when schools are out of session and baseline engagement is far lower, so that the value of personalization and targeting can be compared across a high-engagement and a low-engagement context.

Addendum disclosure: the summer implementation was not pre-registered in advance. It was designed and fielded after the school-year Phase 1 results were known and is documented here for completeness and transparency.
External Link(s)

Registration Citation

Citation
AULAGNON, RAPHAELLE et al. 2026. "How much do we gain from personalization?" AEA RCT Registry. June 19. https://doi.org/10.1257/rct.12190-2.1
Experimental Details

Interventions

Intervention(s)
The intervention is a series of motivational push notifications delivered through the Conecta Ideas mobile app to parents of primary-school students, encouraging them to have their child use the app's mathematics exercises. Notifications are the only treatment; the platform's default is to send none.

Message content. We designed four base messages, each built around a single behavioral channel:
- Teacher recommendation (social norms): the child's teacher recommends regular use of the app.
- Peer usage (social norms): many students across the country already use the app.
- Parental support (identity): parents play a central role in their child's learning.
- Future opportunities (present bias): learning math now opens later academic and job opportunities.

Each message is lightly personalized by inserting the child's name (and, for the teacher message, the teacher's name); the catalog of four messages is held fixed across all phases and both implementations.

Delivery. Treated parents receive two notifications per week — on Mondays and Thursdays at 5pm local time — for three weeks. The control group receives no notifications. Messages are sent as in-app push notifications and are short and low-bandwidth, consistent with the platform's design for low-end smartphones.

How the intervention varies across phases. In Phase 1, the four base messages are combined into the experimental arms used to estimate effects (in the school-year experiment, the 16 arms are all ordered pairs of the four base messages — the message sent on Monday and the message sent on Thursday of each week; in the summer experiment, four single-message arms are used). In Phase 2, each parent is assigned a message according to one of the assignment rules under test — a single uniform "best" message for everyone, a message individually selected by a causal-forest model, a message individually selected by a nearest-neighbor model, or a randomly drawn message — and is then sent that message on the same twice-weekly, three-week cadence.
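As an illustration of the school-year Phase 1 arm structure, the sketch below enumerates all ordered (Monday, Thursday) pairs of the four base messages. The short labels are hypothetical names introduced here, not identifiers from the study. Because 4 x 4 = 16, the count stated above implies that pairs repeating the same message on both days are included.

```python
from itertools import product

# Hypothetical short labels for the four base messages described above.
BASE_MESSAGES = ["teacher", "peers", "parent_role", "future"]

# Every (Monday message, Thursday message) combination, repeats allowed.
ARMS = list(product(BASE_MESSAGES, repeat=2))

for arm_id, (monday, thursday) in enumerate(ARMS, start=1):
    print(f"arm {arm_id:2d}: Monday={monday:<12} Thursday={thursday}")
```

Order matters in this construction: ("teacher", "peers") and ("peers", "teacher") are distinct arms.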

The intervention is intentionally low-cost and high-frequency: the marginal cost of an additional message is effectively zero, which is why the study focuses on which message to send and to whom, rather than on the cost of messaging itself.
Intervention (Hidden)
Intervention Start Date
2023-10-09
Intervention End Date
2023-12-03

Primary Outcomes

Primary Outcomes (end points)
Platform login (extensive margin of use): a binary indicator equal to one if the student logs into the Conecta Ideas platform at least once during the post-intervention observation window (weekly login).
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Intensive margin of platform use: (1) time spent on the platform (minutes) and (2) the number of exercises/modules completed during the post-intervention window.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The study is a two-phase, individual-level randomized controlled trial, run with a population of roughly 100,000 parents of primary-school students who had used the Conecta Ideas mathematics app at least once in the prior year. The same population is carried across both phases so that assignment rules learned in the first phase can be tested out of sample on the second.

PHASE 1 (estimation). Parents are randomly assigned across a set of message arms plus a no-message control. Randomization is at the individual level and stratified by whether the student logged in during the week before the experiment begins. In the school-year experiment, there are 16 message arms, formed as all ordered pairs of the four base messages (the message delivered on Monday and the message delivered on Thursday of each week); in the summer experiment, there are four single-message arms.
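A minimal sketch of individual-level randomization stratified by prior-week login follows. The registry states that the actual randomization was done in Stata; this Python version is illustrative only, and the function name, arm labels, seed, and toy population are assumptions. The weights mirror the planned school-year allocation (16 message arms of equal size plus a control four times as large).

```python
import random
from collections import defaultdict

def stratified_assign(units, strata, arms, weights, seed=0):
    """Assign each unit to an arm, randomizing separately within each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit, stratum in zip(units, strata):
        by_stratum[stratum].append(unit)

    assignment = {}
    total = sum(weights)
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)
        # Cut the shuffled list into consecutive blocks proportional to the weights.
        start, cum = 0, 0
        for arm, weight in zip(arms, weights):
            cum += weight
            end = round(len(members) * cum / total)
            for unit in members[start:end]:
                assignment[unit] = arm
            start = end
    return assignment

# Toy population: 2,000 units, stratum = 1 if the student logged in last week.
MESSAGES = ["teacher", "peers", "parent_role", "future"]  # hypothetical labels
arms = [f"{mon}+{thu}" for mon in MESSAGES for thu in MESSAGES] + ["control"]
weights = [1] * 16 + [4]
units = list(range(2000))
strata = [u % 2 for u in units]
assignment = stratified_assign(units, strata, arms, weights, seed=42)
```

Shuffling within each stratum before slicing keeps the arm shares balanced inside both the logged-in and the not-logged-in groups, which is the point of stratifying.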

Using the Phase 1 outcome data, we estimate the average effect of each message by OLS and estimate how treatment effects vary across individuals using two machine-learning methods, a causal forest and a k-nearest-neighbor matcher, each given a common vector of pre-treatment characteristics (e.g., grade, urban/rural location, prior platform use, baseline math achievement, and school characteristics).
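To make the nearest-neighbor idea concrete, here is a rough sketch of one way a k-nearest-neighbor matcher could predict message effects for a given covariate vector. The registry does not specify the distance metric, the value of k, or the matching details, so Euclidean distance, the function names, and the toy data are all assumptions. The causal forest is not shown.

```python
import math

def knn_mean(x, pool, k):
    """Mean outcome of the k units in `pool` closest to covariate vector x."""
    nearest = sorted(pool, key=lambda unit: math.dist(x, unit[0]))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def knn_effects(x, data_by_arm, control="control", k=3):
    """Predicted effect of each message arm at x, relative to the control arm."""
    baseline = knn_mean(x, data_by_arm[control], k)
    return {arm: knn_mean(x, pool, k) - baseline
            for arm, pool in data_by_arm.items() if arm != control}

# Made-up data: each unit is (covariates, login outcome); one covariate only.
data_by_arm = {
    "control": [((0.0,), 0), ((0.1,), 0), ((1.0,), 1), ((0.9,), 1)],
    "teacher": [((0.0,), 1), ((0.1,), 1), ((1.0,), 1), ((0.9,), 1)],
    "future":  [((0.0,), 0), ((0.1,), 0), ((1.0,), 1), ((0.9,), 1)],
}

low_use = knn_effects((0.0,), data_by_arm, k=2)
high_use = knn_effects((1.0,), data_by_arm, k=2)
print(low_use, high_use)
```

In this toy example the "teacher" message raises logins only for units near covariate value 0, which is the kind of heterogeneity the Phase 1 estimators are meant to detect.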

PHASE 2 (out-of-sample test). The same population is re-randomized, with roughly equal allocation, into four arms:
- Best: every parent receives the single message with the highest average effect in Phase 1 (a uniform, non-personalized policy).
- Causal forest (personalized): each parent receives the message that the causal forest predicts is best for that individual.
- Nearest neighbor (personalized): each parent receives the message that the k-nearest-neighbor model predicts is best for that individual.
- Random: each parent receives a randomly drawn message, serving as the no-personalization benchmark.
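The four Phase 2 rules listed above can be sketched as a single selection function. All labels and numbers below are made up for illustration. The sketch only picks which message a parent would be sent under each rule; it does not reproduce the treated/control split within each arm.

```python
import random

MESSAGES = ["teacher", "peers", "parent_role", "future"]  # hypothetical labels

def best_message(average_effects):
    """Uniform policy: the message with the highest Phase 1 average effect."""
    return max(average_effects, key=average_effects.get)

def personalized_message(predicted_effects):
    """Personalized policy: the message with the highest predicted effect for one parent."""
    return max(predicted_effects, key=predicted_effects.get)

def phase2_message(arm, average_effects, forest_pred, knn_pred, rng):
    """Message sent to one parent, given the Phase 2 arm the parent was drawn into."""
    if arm == "best":
        return best_message(average_effects)
    if arm == "causal_forest":
        return personalized_message(forest_pred)
    if arm == "nearest_neighbor":
        return personalized_message(knn_pred)
    if arm == "random":
        return rng.choice(MESSAGES)
    raise ValueError(f"unknown arm: {arm}")

# Made-up effect estimates (in probability units) for one parent.
average_effects = {"teacher": 0.012, "peers": 0.008, "parent_role": 0.010, "future": 0.005}
forest_pred = {"teacher": 0.004, "peers": 0.015, "parent_role": 0.009, "future": 0.002}
knn_pred = {"teacher": 0.006, "peers": 0.001, "parent_role": 0.011, "future": 0.003}
rng = random.Random(0)

for arm in ["best", "causal_forest", "nearest_neighbor", "random"]:
    print(arm, "->", phase2_message(arm, average_effects, forest_pred, knn_pred, rng))
```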

Randomization is again at the individual level, stratified by prior-week login. Phase 2 is thus an experimental evaluation of the Phase 1 machine-learning predictions: the personalized and targeted assignment rules are fixed using Phase 1 data and then implemented on a fresh draw of the same population.

The design lets us separate two questions. Personalization asks WHICH message to send to each individual (does tailoring the message beat a single uniform best message?). Targeting asks WHETHER to message a given individual at all, given a common message (can we identify who responds most and concentrate messaging on them?). The two phases together allow both questions to be answered out of sample rather than only on held-out folds of a single experiment.
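One simple form a targeting rule could take is to rank individuals by their predicted response to the common message and message only the top share. The registry does not state the actual rule (for example, a threshold versus a fixed share), so the function, the share, and the numbers below are assumptions for illustration.

```python
def target_top_share(predicted_effect, share):
    """Return the ids of the `share` of individuals with the highest predicted effect."""
    if not 0 <= share <= 1:
        raise ValueError("share must be between 0 and 1")
    ranked = sorted(predicted_effect, key=predicted_effect.get, reverse=True)
    n_target = round(len(ranked) * share)
    return set(ranked[:n_target])

# Made-up predicted effects of a single common message on login probability.
predicted = {"a": 0.03, "b": -0.01, "c": 0.05, "d": 0.00}
messaged = target_top_share(predicted, share=0.5)
print(sorted(messaged))
```

Because the marginal cost of a message is close to zero in this setting, any gain from such a rule would come from avoiding messages to people predicted to respond negatively or not at all, rather than from saving on delivery costs.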

CONTEXTS. The design is implemented twice in contrasting engagement environments: once during the school year (high baseline platform use) and once during the summer vacation (low baseline platform use). Comparing the two lets us assess how the value of personalization and targeting depends on the engagement level of the target population.

ADDENDUM DISCLOSURE. The summer implementation was not pre-registered in advance; it was designed and fielded after the school-year Phase 1 results were known and is described here for completeness and transparency.
Experimental Design Details
Randomization Method
Randomization is done in Stata.
Randomization Unit
The treatment is randomized at the individual level.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
NA
Sample size: planned number of observations
100,000
Sample size (or number of clusters) by treatment arms
Total sample: ~100,000 parent-student pairs, held fixed across phases.

EXPERIMENT 1 — SCHOOL YEAR
Phase 1: 16 message arms at ~5,000 each (~80,000) + no-message control (~20,000) = 100,000.
Phase 2: 4 arms at ~25,000 each — Best, Causal Forest, Nearest Neighbor, Random
(within each arm, ~half treated / ~half control).

EXPERIMENT 2 — SUMMER (addendum; not pre-registered)
Phase 1: 4 message arms + no-message control at ~20,000 per cell = ~100,000.
Phase 2: 4 arms at ~25,000 each — Best, Causal Forest, Nearest Neighbor, Random
(within each arm, ~half treated / ~half control).
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
EXPERIMENT 1: SCHOOL YEAR (Phase 2, ~25,000 per arm)
MDE for the personalized-vs-uniform-best (Causal Forest - Best) contrast: ~1.2 p.p.
Individual-arm standard error: ~0.31 p.p.
Control-arm login rate: ~36% (Phase 2); ~42% (Phase 1).
Outcome (Bernoulli) standard deviation at the control mean: ~0.48.
MDE relative to the control mean: ~3% of the control login rate (~0.025 SD).

EXPERIMENT 2: SUMMER (Phase 2, ~25,000 per arm; addendum, not pre-registered)
MDE for the same contrast: ~0.5 p.p.
Individual-arm standard error: ~0.12 p.p.
Control-arm login rate: ~2%.
Outcome (Bernoulli) standard deviation at the control mean: ~0.14.
MDE relative to the control mean: ~25% of the control login rate (~0.036 SD).
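For readers who want to see how figures of this kind are typically derived, the sketch below applies the textbook minimum-detectable-effect formula for a difference between two equal-sized arms with a binary outcome. The registry does not state the formula, significance level, or power behind its numbers, so the two-sided 5% test and 80% power used here are assumptions. Under those assumptions, the school-year inputs (36% control login rate, 25,000 per arm) give roughly 1.2 p.p.

```python
from statistics import NormalDist

def bernoulli_sd(p):
    """Standard deviation of a binary outcome with mean p."""
    return (p * (1 - p)) ** 0.5

def mde_two_arm(p_control, n_per_arm, alpha=0.05, power=0.80):
    """MDE for the difference between two equal-sized arms (assumed test settings)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Standard error of a difference in means, using the control-arm variance in both arms.
    se_diff = (2 * p_control * (1 - p_control) / n_per_arm) ** 0.5
    return z * se_diff

school_year_mde = mde_two_arm(p_control=0.36, n_per_arm=25_000)
print(f"SD at control mean: {bernoulli_sd(0.36):.2f}")
print(f"MDE: {100 * school_year_mde:.2f} p.p.")
print(f"MDE in SD units: {school_year_mde / bernoulli_sd(0.36):.3f}")
```

The sample size is left as a parameter so that the same function can be evaluated for other arm sizes or baseline rates.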
IRB

Institutional Review Boards (IRBs)

IRB Name
Cornell University
IRB Approval Date
2023-10-05
IRB Approval Number
0147998

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials