Experimental Design
We send emails to 7,500 child care facilities in Germany. Each facility receives a single email with the following wording (the original email is in German):
_________
ENGLISH TRANSLATION
Dear Sir or Madam,
We are looking for a childcare place for [our baby's name], who is currently [2/5/8/12/18/24 months] old (see attached picture). As [my wife/husband] and I are planning to move to the area in [April/July/October] and to return to work, we are interested in a childcare place starting in [corresponding month] 2026.
Do you have any slots available? And how can we apply for a slot?
Thank you very much!
Sincerely,
[Mother's name / Father's name]
_________
ORIGINAL GERMAN VERSION
Sehr geehrte Damen und Herren,
wir sind auf der Suche nach einem Betreuungsplatz für [unsere/unseren Babyname], [der/die] derzeit [2/5/8/12/18/24 Monate] alt ist (siehe beigefügtes Foto). Da [meine Frau/mein Mann] und ich planen, im [April/Juli/Oktober] neu in die Gegend zu ziehen und wieder arbeiten zu gehen, sind wir an einem Betreuungsplatz ab [entsprechender Monat] 2026 interessiert.
Haben Sie noch einen freien Platz? Und wie können wir uns für einen Platz bewerben?
Vielen Dank!
Mit freundlichen Grüßen,
[Name Mother / Name Father]
_________
While we randomly vary several characteristics of the email above (as indicated by the brackets), the main variation lies in the race of the parents, which is signaled via images. Specifically, each email contains an AI-generated image that shows a toddler sitting in the middle with its parents to the left and right (the toddler’s face is visible, the parents’ faces are not). We use typical German first and last names so that the names do not signal race or other characteristics (e.g., socioeconomic status, SES). The names have been validated in a Prolific survey with German-speaking participants. We also use an algorithm to vary the toddler’s race gradually on a scale from 0 to 1 in steps of 0.25. Our main dimension of interest is the parents’ race, which we vary in a binary fashion with the following two conditions:
1. Black father, Black mother
2. white father, white mother
Within each stratum, we randomly assign observations to one of the two conditions. In addition, within each stratum and condition, we assign one fifth of observations to each race category of the toddler (0, 0.25, 0.5, 0.75, 1). Similarly, within each stratum and condition, we randomly assign observations to the different email contents (child's age, start date, etc.).
Our main hypothesis is that Black parents are treated differently from white parents. Specifically:
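The assignment described above can be illustrated with a minimal sketch. It is not the study's actual randomization code: the function name, the condition labels, and the round-robin allocation over shuffled units are assumptions made for illustration, and only the parent condition and the toddler race category are covered (the other email contents could be added as further factors in the same way).

```python
import itertools
import random
from collections import defaultdict

# Labels are illustrative; the source only names the two parent conditions
# and the five toddler race categories.
CONDITIONS = ["black_parents", "white_parents"]
TODDLER_RACE = [0.0, 0.25, 0.5, 0.75, 1.0]


def assign_within_strata(facilities, seed=0):
    """Assign each facility to a (condition, toddler_race) cell within its stratum.

    facilities: iterable of (facility_id, stratum) pairs (hypothetical input format).
    Returns a dict mapping facility_id -> (condition, toddler_race).
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for facility_id, stratum in facilities:
        by_stratum[stratum].append(facility_id)

    cells = list(itertools.product(CONDITIONS, TODDLER_RACE))  # 2 x 5 = 10 cells
    assignment = {}
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        rng.shuffle(ids)  # random order of facilities within the stratum
        order = cells[:]
        rng.shuffle(order)  # so leftover units do not always fill the same cells
        for position, facility_id in enumerate(ids):
            # Round-robin over the shuffled cells keeps cell sizes within one
            # unit of each other inside every stratum.
            assignment[facility_id] = order[position % len(order)]
    return assignment
```

With a stratum whose size is a multiple of ten, every combination of condition and toddler race category receives the same number of facilities, so each condition holds half of the stratum and each race category holds one fifth of each condition.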
H1: Black parents have lower response rates than white parents.
H2: Black parents receive fewer helpful emails than white parents (both conditional and unconditional on receiving a response).
H3: Black parents face longer response times than white parents (both conditional and unconditional on receiving a response).
Our pre-study’s main aim is to validate that participants view the images before deciding whether, when, and what to respond. We expect the above hypotheses to hold based on a previous study by Hermes et al. (2023), which varies immigrant/native status via names in an email to German child care facilities. As such, we will interpret differences in responses as validation that participants did, in fact, view the images before making a decision.
Our study has the following exclusion criteria. We start with a list of the universe of child care facilities in Germany. Prior to randomization, we exclude, first, all child care facilities for which we could not identify an email address and, second, all those whose identified email address was found to be unreachable or invalid according to Zerobounce, an email testing service. In the experiment, we will further exclude all email addresses that bounce back. Finally, prior to assigning facilities to treatment conditions, we randomly select 7,500 out of 43,291 child care facilities (17.32%) to participate in this experiment, drawing within each stratum. The remainder is reserved for other parts of the experiment.
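As a sketch of how such a draw could look, the snippet below allocates the 7,500 facilities proportionally across strata and then samples without replacement inside each stratum. The proportional allocation with largest-remainder rounding, the function names, and the stratum sizes in the usage example are assumptions for illustration; the source only states the totals and that the draw takes place within strata.

```python
import math
import random


def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Uses largest-remainder rounding so the allocations sum exactly to total_sample.
    """
    population = sum(stratum_sizes.values())
    exact = {s: n * total_sample / population for s, n in stratum_sizes.items()}
    alloc = {s: math.floor(x) for s, x in exact.items()}
    shortfall = total_sample - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:shortfall]:
        alloc[s] += 1
    return alloc


def sample_within_strata(ids_by_stratum, total_sample, seed=0):
    """Draw the allocated number of facilities at random inside each stratum."""
    rng = random.Random(seed)
    sizes = {s: len(ids) for s, ids in ids_by_stratum.items()}
    alloc = proportional_allocation(sizes, total_sample)
    return {s: rng.sample(ids, alloc[s]) for s, ids in ids_by_stratum.items()}


# Hypothetical stratum sizes that merely sum to the 43,291 facilities in the text.
example_sizes = {"A": 20000, "B": 15000, "C": 8291}
example_alloc = proportional_allocation(example_sizes, 7500)
sampling_share = round(100 * 7500 / 43291, 2)  # share of facilities drawn, in percent
```

Because the allocation is proportional, each stratum contributes roughly the same 17.32% share of its facilities, which keeps the experimental sample representative of the stratification variables.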
Methods: We will estimate linear probability models by ordinary least squares. Standard errors will be clustered at the level of the individual facility; because each facility is observed once and there is no repeated treatment, observations are independent. Our main regression will not include control variables.
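For the main specification without controls, the linear probability model reduces to a regression of a binary outcome on a treatment dummy, so the slope equals the difference in response rates between the two conditions. The sketch below illustrates this with closed-form expressions. The HC1 heteroskedasticity-robust variance is an assumption (with one observation per cluster, clustering at the facility level amounts to a heteroskedasticity-robust estimator, but the exact small-sample correction is not stated in the text), and the function and variable names are hypothetical.

```python
import math


def lpm_binary_treatment(outcome, treated):
    """OLS of a 0/1 outcome on a constant and a 0/1 treatment dummy.

    Returns the intercept (control-group mean), the slope (difference in
    means), and an HC1 robust standard error for the slope.
    """
    y1 = [y for y, t in zip(outcome, treated) if t == 1]
    y0 = [y for y, t in zip(outcome, treated) if t == 0]
    n1, n0 = len(y1), len(y0)
    mean1, mean0 = sum(y1) / n1, sum(y0) / n0

    # Sums of squared OLS residuals within each group.
    ss1 = sum((y - mean1) ** 2 for y in y1)
    ss0 = sum((y - mean0) ** 2 for y in y0)

    n, k = n0 + n1, 2  # k counts the constant and the treatment dummy
    # Sandwich variance for a saturated two-group model, scaled by n / (n - k).
    var_slope = (ss1 / n1**2 + ss0 / n0**2) * n / (n - k)
    return {
        "intercept": mean0,
        "slope": mean1 - mean0,
        "se_slope": math.sqrt(var_slope),
    }


# Made-up toy data, not study results: 1 = facility responded.
toy_outcome = [1, 1, 1, 0, 1, 0, 0, 0]
toy_treated = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = Black parents condition
toy_fit = lpm_binary_treatment(toy_outcome, toy_treated)
```

Under H1, the slope on the Black parents dummy would be negative when the outcome is an indicator for receiving a response.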
Coding of Responses: Human research assistants (RAs) and LLMs will code the helpfulness of 500 responses. If inter-rater reliability between human coders and the LLM is comparable to that among human coders, we will rate the remaining responses using the LLM only. Otherwise, human coders will rate the remaining responses.
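One way to operationalize this comparison is sketched below using Cohen's kappa for pairs of raters. The choice of kappa is an assumption, since the text does not name a reliability statistic, and the sketch deliberately leaves open what counts as "comparable". The function name and the ratings in the usage example are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(ratings_a)
    # Share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Made-up helpfulness codes (1 = helpful, 0 = not helpful) for eight responses.
human_1 = [1, 1, 0, 0, 1, 0, 1, 0]
human_2 = [1, 1, 0, 0, 1, 0, 0, 1]
llm = [1, 1, 0, 0, 1, 0, 1, 1]

kappa_human_human = cohens_kappa(human_1, human_2)
kappa_human_llm = cohens_kappa(human_1, llm)
```

The two kappa values can then be compared on the 500 double-coded responses to decide whether the LLM or human coders rate the remainder.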
_____________
References:
Hermes, H., Lergetporer, P., Mierisch, F., Peter, F., & Wiederhold, S. (2023). Discrimination in Universal Social Programs? A Nationwide Field Experiment on Access to Child Care. CESifo Working Paper No. 10368, CESifo, Munich.