Discrimination of toddlers – a nationwide field experiment

Last registered on February 10, 2026

Pre-Trial

Trial Information

General Information

Title
Discrimination of toddlers – a nationwide field experiment
RCT ID
AEARCTR-0017856
Initial registration date
February 08, 2026

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published
February 10, 2026, 6:42 AM EST

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Locations

Region

Primary Investigator

Affiliation
Max Planck Institute for Behavioral Economics

Other Primary Investigator(s)

PI Affiliation
University of Mannheim

Additional Trial Information

Status
In development
Start date
2026-02-16
End date
2026-12-01
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This is the pre-registration for our main study on racial discrimination against toddlers (independent of parental race). We send emails to German child care facilities, posing as parents who ask about the application process and a free slot for their child. We use our own AI-generated images to signal both the parents’ and the child’s race. Other than that, the emails are on average identical across treatment conditions, including the names of the senders. The main aim of the experiment is to study whether toddlers already experience discrimination based on their race and, importantly, independent of their parents’ race and other features. To ensure that the race of toddlers is orthogonal to parental features, we leverage natural variation in the skin tone of children of mixed-race couples. We therefore always present an AI-generated picture of mixed-race parents to hold constant any perceived or anticipated effects of the parents. We primarily manipulate the race of the child, in gradual steps, to study discrimination and colorism. We further study possible mechanisms of toddler discrimination.
External Link(s)

Registration Citation

Citation
Mill, Wladislaw and Felix Rusche. 2026. "Discrimination of toddlers – a nationwide field experiment." AEA RCT Registry. February 10. https://doi.org/10.1257/rct.17856-1.0
Sponsors & Partners

Experimental Details

Interventions

Intervention(s)
Intervention Start Date
2026-02-16
Intervention End Date
2026-04-20

Primary Outcomes

Primary Outcomes (end points)
We define three families of outcomes as our primary outcomes of interest:
1) Response rate
2) Helpfulness of responses. We follow Hermes et al. (2023) and code six variables:
a) slot offer
b) placement on waiting list
c) response length (above median length after removal of names, signatures, email histories)
d) helpful content (anything that helps the applicant in searching for a spot, e.g. a contact telephone number, a link to a registration portal or the center’s website, mentions of alternative institutions, or an application form).
e) encouraging: we use LLMs (verified by RAs) to rate whether the email is encouraging, as a measure of tone
f) recommendation: we use LLMs (verified by RAs) to answer whether, based on the email, they would recommend the child care facility to a friend, as a measure of tone
3) Response time
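The above-median length indicator in 2c) could be coded along the following lines. This is a minimal Python sketch, not the authors' code: the function name `above_median_length` and the sample strings are hypothetical, the cleaning step (removal of names, signatures, email histories) is assumed to have been done already, and length is counted in characters, which the registration does not specify.

```python
from statistics import median

def above_median_length(cleaned_responses):
    """Return 1 for responses strictly longer than the median length, else 0.

    Assumes names, signatures, and email histories were already removed.
    Length is counted in characters (an assumption, not stated in the source).
    """
    lengths = [len(text) for text in cleaned_responses]
    cutoff = median(lengths)
    return [1 if n > cutoff else 0 for n in lengths]

# Hypothetical cleaned responses, for illustration only.
examples = [
    "No.",
    "Please use the portal.",
    "We have no free slots, but try next door.",
]
flags = above_median_length(examples)
```

Whether ties at the median count as "above" is a design choice; the sketch treats them as not above.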
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
We will run heterogeneity regressions to test potential mechanisms. These include:
- the mentioned start month of the care, to study whether longer time to start might mitigate discrimination.
- the mentioned age of the child, to study whether more cognitive and physical skills interact with discrimination (as one-year-old children of different race might differ less in perception than two-year-olds).
- the AfD vote share to study anti-immigration attitudes
- the regional share of migrants as a measure of contact between migrants and natives
- regional capacity constraints of child care facilities to measure the degree of conflict over scarce resources (measured primarily via our own and external data, if available).

In addition, we run secondary heterogeneity analyses that we expect to be of interest to readers. These are primarily descriptive and run to test the results’ robustness. These include:
- the urbanity (measured in 3 levels) of the region
- the provider type (public, church, other)
- differences by gender of the child
- differences by the gender of the writing parent / the author of the email
- the gender of the Black (white) parent to study whether child care facilities react differently to Black (white) fathers and Black (white) mothers.
- the name of the child
- the name of the parents
- the perceived regional German origin (based on first and last names) of the family and other perceived features of the names of the child and the parents (obtained from a pre-survey).

Finally, we run some tertiary regressions that will likely not be of interest to many readers:
time of day sent, day of week sent, perceptions of pictures, etc.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
We send emails to 25,698 child care facilities in Germany. We will send 1000 emails every day. Each facility receives a single email with the following wording (the original email is in German):
_________
ENGLISH TRANSLATION
Dear Sir or Madam,
we are looking for a childcare place for [our baby's name], who is currently [2/5/8/12/18/24 months] old (see attached picture). As [my wife/husband] and I are planning to move to the area in [April/July/October] and to return to work, we are interested in a childcare place starting in [corresponding month] 2026.
Do you have any slots available? And how can we apply for a slot?
Thank you very much!
Sincerely,
[Mother's name / Father's name]
_________
ORIGINAL GERMAN VERSION
Sehr geehrte Damen und Herren,
wir sind auf der Suche nach einem Betreuungsplatz für [unsere/unseren Babyname], [der/die], derzeit [2/5/8/12/18/24 Monate] alt ist (siehe beigefügtes Foto). Da [meine/mein Frau/Mann ] und ich planen, im [April/Juli/Oktober] neu in die Gegend zu ziehen und wieder arbeiten zu gehen, sind wir an einem Betreuungsplatz ab [entsprechender Monat] 2026 interessiert.
Haben Sie noch einen freien Platz? Und wie können wir uns für einen Platz bewerben?
Vielen Dank!
Mit freundlichen Grüßen,
[Name Mother / Name Father]
_________
While we randomly vary several characteristics of the email above (as indicated), the main variation lies in the race of the child as signaled via images. Specifically, each email contains an AI-generated image that shows a toddler sitting in the middle with its parents on the left and right (the toddler’s face is visible, the parents’ faces are not). We use mixed-race parents to keep any variation in the toddler’s race realistic (and exogenous to other characteristics). We use typical German first and last names so as not to signal race or other characteristics (e.g., SES) via names. The names have been validated via a Prolific survey with German-speaking participants. Using an algorithm that gradually varies the toddler’s race, we vary race from 0 to 1 in steps of 0.25. Our main dimension of interest is the toddler’s race. Within each stratum, we assign one-fifth of observations to each race category of the toddler (0, 0.25, 0.5, 0.75, 1). Similarly, within each stratum, we assign observations to the different email contents (such as age of child, start date, etc.).
Our main hypothesis is that darker toddlers are treated differently from lighter toddlers, specifically:
H1: Parents of Black toddlers receive fewer responses than parents of white toddlers.
H2: Parents of Black toddlers receive fewer helpful emails than parents of white toddlers (both conditional and unconditional on receiving a response).
H3: Response times for parents of Black toddlers are longer (both conditional and unconditional on receiving a response).

To test these hypotheses, we will conduct simple binary comparisons of Black toddlers (0, 0.25) and white toddlers (0.75, 1). We will also run linear regressions of the outcomes on toddlers’ race and estimate non-linear functional forms of race. In addition, we will compute separate coefficients for each toddler race group (0, 0.25, 0.5, 0.75, 1).
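The binary comparison and the linear specification could be computed roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' analysis code: the function names `binary_gap` and `lpm_slope` are hypothetical, the data are made-up toy values rather than study results, and the race coding follows the text (0 and 0.25 as Black, 0.75 and 1 as white).

```python
from statistics import mean

def binary_gap(race_levels, responded):
    """Difference in mean response: white (0.75, 1) minus Black (0, 0.25).

    Observations at race level 0.5 are left out of this comparison.
    """
    black = [y for r, y in zip(race_levels, responded) if r in (0.0, 0.25)]
    white = [y for r, y in zip(race_levels, responded) if r in (0.75, 1.0)]
    return mean(white) - mean(black)

def lpm_slope(race_levels, responded):
    """OLS slope of response on race level (linear probability model, no controls)."""
    x_bar, y_bar = mean(race_levels), mean(responded)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(race_levels, responded))
    sxx = sum((x - x_bar) ** 2 for x in race_levels)
    return sxy / sxx

# Made-up toy data (1 = facility responded), not study results.
race = [0.0, 0.25, 0.5, 0.75, 1.0, 0.0, 0.25, 0.5, 0.75, 1.0]
resp = [0, 1, 1, 1, 1, 1, 0, 0, 1, 1]
gap = binary_gap(race, resp)
slope = lpm_slope(race, resp)
```

The sketch omits standard errors and the non-linear and group-specific specifications; in practice a regression library would supply those.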

Our study has the following exclusion criteria: We start with a list of the universe of child care facilities in Germany. Prior to randomization, we exclude, first, all child care facilities for which we could not identify an email address and, second, all those whose identified email address was found to be unreachable or false according to Zerobounce, an email testing service. In the experiment, we will further exclude all email addresses that bounce back. Finally, prior to assigning facilities to treatment conditions, we randomly selected 25,698 out of 43,291 child care facilities, or 60% within each stratum, to participate in this experiment. The remainder is reserved for other parts of the experiment. Further, we exclude all email addresses that have been mentioned in a prior response from another facility (sometimes an address might be used by multiple centers, sometimes a message might be forwarded), so that no center receives more than one message.
Methods: We will estimate linear probability models and linear regressions. Standard errors will be clustered at the individual level, given that each observation is independent and there is no repeated treatment. Our main regression will not include control variables.
Coding of Responses: We will code the content of messages and the helpfulness of responses using human RAs and LLMs on 500 responses (measures: helpful content, encouraging, recommendation). If inter-rater reliability between humans and the LLM is comparable to that between humans, we will rate the remaining responses using the LLM only. If it is not comparable, humans will code the remaining responses.
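The reliability comparison could look like the following minimal Python sketch. Several pieces are assumptions, since the registration names neither a reliability statistic nor a threshold: Cohen's kappa is used as the statistic, `llm_is_comparable` and its `tolerance` are a hypothetical decision rule, and the ratings are made-up values.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

def llm_is_comparable(kappa_human_human, kappa_human_llm, tolerance=0.05):
    """Hypothetical decision rule; the tolerance is an illustrative assumption."""
    return kappa_human_llm >= kappa_human_human - tolerance

# Made-up binary "helpful content" codes for eight responses.
human_1 = [1, 1, 0, 0, 1, 0, 1, 0]
human_2 = [1, 1, 0, 0, 1, 0, 0, 0]
llm = [1, 1, 0, 0, 1, 1, 1, 0]
k_hh = cohens_kappa(human_1, human_2)
k_hl = cohens_kappa(human_1, llm)
use_llm_only = llm_is_comparable(k_hh, k_hl)
```

Kappa suits the binary measures; for ordinal tone ratings a weighted variant might be the better fit.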
Experimental Design Details
Not available
Randomization Method
Stratified randomization (block randomization) is done using R.
We stratify treatment by state (16 states), urbanity (large city, middle city, other), AfD vote share (median split within state), and provider type (church/public/other). This yields a total of 288 potential strata (16 × 3 × 2 × 3). Because some strata do not exist within a given state (e.g., in city states there is only one level of urbanity), we end up with a total of 227 strata.
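The number of potential strata follows from crossing the four stratification variables. A short Python sketch of that count is given here; the variable names and category labels are illustrative, and the 227 realized strata cannot be derived this way because they depend on which combinations actually occur in the data.

```python
from itertools import product

states = range(1, 17)  # 16 federal states, numbered for illustration
urbanity = ["large city", "middle city", "other"]
afd_share = ["below median", "above median"]  # median split within state
provider = ["church", "public", "other"]

# Every combination of the four stratification variables.
potential_strata = list(product(states, urbanity, afd_share, provider))
n_potential = len(potential_strata)  # 16 * 3 * 2 * 3
```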
Randomization Unit
Child care center level.
Details: We randomly assign units within a single stratum to one of the five treatment conditions. Within strata, we ensure that the same number of units is assigned to each of the conditions (with minor differences when the number of units is not divisible by five). In addition, we randomly assign other email characteristics within strata, including age and gender of the child, requested start date, and gender of the emailing person (father or mother). Finally, within each stratum, we assign half of the observations to a Black mother and white father, and the other half to a white mother and Black father.
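Balanced assignment within strata can be sketched as follows. The registration states that randomization is done in R; this Python version is only an illustration of the general technique, and the function name `assign_within_strata`, the seed, and the toy units are hypothetical. It covers the five race levels only, not the other randomized email characteristics.

```python
import random
from collections import defaultdict

RACE_LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]

def assign_within_strata(units, seed=0):
    """Balanced random assignment of race levels within each stratum.

    `units` is a list of (unit_id, stratum) pairs. Within a stratum, level
    counts differ by at most one when the size is not divisible by five.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit_id, stratum in units:
        by_stratum[stratum].append(unit_id)

    assignment = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)
        # Stack shuffled blocks of the five levels until the stratum is
        # covered; a final partial block hands out a random subset of levels.
        levels = []
        while len(levels) < len(ids):
            block = RACE_LEVELS[:]
            rng.shuffle(block)
            levels.extend(block)
        for unit_id, level in zip(ids, levels):
            assignment[unit_id] = level
    return assignment

# Hypothetical units: 12 facilities in stratum "A", 10 in stratum "B".
units = [(f"A{i}", "A") for i in range(12)] + [(f"B{i}", "B") for i in range(10)]
assignment = assign_within_strata(units)
```

In stratum "B" each level is assigned exactly twice; in stratum "A" two randomly chosen levels receive a third unit.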
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
25,698 participating child care centers (we expect a reduction due to emails bouncing back).
Sample size: planned number of observations
We include 25,698 participating institutions, but expect this number to drop due to false or unreachable email addresses, that is, emails bouncing back. These will be removed from the data after the intervention is over. To avoid losing too many observations, we checked the validity of email addresses via Zerobounce prior to the experiment and excluded those flagged as invalid.
Sample size (or number of clusters) by treatment arms
Participating institutions within a given stratum are equally distributed across the five main treatment arms (the five race levels of the toddler).
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Regarding our primary outcome (response rate), Hermes et al. (2023) found a native response rate of 71% and a native-migrant gap of 4.4pp. With 80% power and a 95% confidence level, detecting a gap of that size would require a sample size of 2,584. To be able to detect substantially smaller effects, we increased the sample size considerably. Our 25,698 observations allow us to identify gaps of around 1.6pp with the same power and confidence level.
References:
Hermes, H., Lergetporer, P., Mierisch, F., Peter, F., & Wiederhold, S. (2023). Discrimination in Universal Social Programs? A Nationwide Field Experiment on Access to Child Care. CESifo Working Paper No. 10368. CESifo, Munich.
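The minimum detectable gap for the response rate can be approximated with a standard two-proportion formula. The Python sketch below is not the authors' power calculation and rests on simplifying assumptions: a two-sided test at the 5% level, a normal approximation, the full sample split into two equal groups, and the 71% baseline rate used for the variance in both groups. The function name `mde_two_proportions` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mde_two_proportions(n_total, baseline, alpha=0.05, power=0.80):
    """Approximate minimum detectable difference between two equal-sized groups.

    Normal approximation, two-sided test, variance evaluated at the baseline
    rate in both groups (simplifying assumptions).
    """
    n_per_group = n_total / 2
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * sqrt(2 * baseline * (1 - baseline) / n_per_group)

mde = mde_two_proportions(n_total=25_698, baseline=0.71)
print(f"MDE: {mde * 100:.2f} percentage points")
```

Under these assumptions the sketch gives roughly 1.6 percentage points, in line with the figure stated above. A comparison that drops the 0.5 race level uses fewer observations and would have a somewhat larger minimum detectable effect.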
IRB

Institutional Review Boards (IRBs)

IRB Name
Ethics Committee of the University of Mannheim
IRB Approval Date
2025-12-18
IRB Approval Number
EC Mannheim 95/2025