
Fields Changed

Registration

Field Before After
Abstract
Before: We study how childcare decision-makers respond to email inquiries from parents. In a US field experiment with daycare centers, we send an email that varies (i) whether the message frames the situation as an exogenous emergency (“we unexpectedly moved”) or as an endogenous oversight (“we dropped the ball”), and (ii) whether the sender is a mother or a father (by using a male- or female-sounding name). We measure if and how centers respond (reply rate and speed), whether they offer an appointment or a spot (and timing), and the tone/helpfulness of replies. Some centers are randomly assigned to receive a neutral follow-up five days after the initial email if no reply has been received.
After: We study how childcare and eldercare decision-makers respond to email inquiries from parents or children. In a US field experiment with daycare and eldercare centers, we send an email that varies (i) whether the message frames the situation as an exogenous emergency (“our caregiver is moving unexpectedly”), as an endogenous oversight (“we dropped the ball”), or does not specify a reason; (ii) whether the sender is male or female (by using a male- or female-sounding name); and (iii) whether the sender is a single parent, in the case of childcare. We measure if and how centers respond (reply rate and speed), whether they offer an appointment or a spot (and timing), and the tone/helpfulness of replies.
Last Published
Before: September 01, 2025 03:19 PM
After: February 26, 2026 11:58 AM
Randomization Unit
Before: Daycare center.
After: Daycare or eldercare center.
Planned Number of Clusters
Before: We plan to include approximately 2,000 daycare centers in the pilot study. Each daycare center represents one cluster, as responses are measured at the center level. Following the pilot, we will update our power calculations based on the observed effect sizes.
After: We included approximately 2,000 daycare centers in the first pilot study, and we plan to include approximately 3,000 daycare centers and 1,500 eldercare centers in the second pilot study. Each center represents one cluster, as responses are measured at the center level. Following the pilots, we will update our power calculations based on the observed effect sizes.
Planned Number of Observations
Before: Because each daycare center is contacted only once (with a follow-up in the encouragement arm if no response), the total number of observations will also be 2,000 daycare centers for the primary outcomes in the pilot. Each observation corresponds to one unique center’s response (or non-response) to our email. The secondary outcomes (e.g., sentiment of the email response) will be determined by the response rate to our various treatments.
After: Because each center is contacted only once, the total number of observations will also be 2,000 and 4,500 care centers for the primary outcomes in the first and second pilots, respectively. Each observation corresponds to one unique center’s response (or non-response) to our email. The secondary outcomes (e.g., sentiment of the email response) will be determined by the response rate to our various treatments.
Sample size (or number of clusters) by treatment arms
Before: We are running a 2×2 factorial design with four treatment arms with equal-size samples in each treatment:
T1_M: “Unexpected move” message, male sender
T1_F: “Unexpected move” message, female sender
T2_M: “Dropped the ball” message, male sender
T2_F: “Dropped the ball” message, female sender
We plan to have 2,000 centers in the pilot.
After: We are running a 3×2×3 factorial design with eighteen treatment arms with equal-size samples in each treatment:
T1_M_S: “Unexpected Need” message, male sender, single parent
T1_F_S: “Unexpected Need” message, female sender, single parent
T2_M_S: “Dropped the ball” message, male sender, single parent
T2_F_S: “Dropped the ball” message, female sender, single parent
T3_M_S: No reason message, male sender, single parent
T3_F_S: No reason message, female sender, single parent
T1_M_N: “Unexpected Need” message, male sender, non-single
T1_F_N: “Unexpected Need” message, female sender, non-single
T2_M_N: “Dropped the ball” message, male sender, non-single
T2_F_N: “Dropped the ball” message, female sender, non-single
T3_M_N: No reason message, male sender, non-single
T3_F_N: No reason message, female sender, non-single
T1_M_C: “Unexpected Need” message, male sender, child
T1_F_C: “Unexpected Need” message, female sender, child
T2_M_C: “Dropped the ball” message, male sender, child
T2_F_C: “Dropped the ball” message, female sender, child
T3_M_C: No reason message, male sender, child
T3_F_C: No reason message, female sender, child
We plan to have 2,000 centers and 4,500 centers in the first and second pilots.
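For concreteness, equal-size assignment to the eighteen cells of the factorial design can be sketched as follows. This is an illustrative sketch only, not the registered randomization procedure; the cell labels mirror the treatment-arm codes above, and the round-robin dealing is an assumption chosen to guarantee equal cell sizes.

```python
# Illustrative sketch (NOT the registered procedure): equal-size random
# assignment of centers to the 18 cells of the 3x2x3 factorial design.
import itertools
import random

framings = ["unexpected_need", "dropped_ball", "no_reason"]   # T1 / T2 / T3
genders = ["M", "F"]
sender_types = ["S", "N", "C"]  # single parent / non-single / child

# Cartesian product gives the 18 treatment cells.
arms = list(itertools.product(framings, genders, sender_types))

def assign(center_ids, seed=0):
    """Shuffle centers, then deal them round-robin across arms so that
    cell sizes differ by at most one."""
    rng = random.Random(seed)
    ids = list(center_ids)
    rng.shuffle(ids)
    return {cid: arms[i % len(arms)] for i, cid in enumerate(ids)}

assignment = assign(range(4500))  # second-pilot size: 4500 / 18 = 250 per arm
```

With 4,500 centers this yields exactly 250 centers per arm; the shuffle step makes cell membership (though not cell size) random.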
Power calculation: Minimum Detectable Effect Size for Main Outcomes
Binary outcomes (reply yes/no): With n=500 centers in each treatment arm (four arms total, N=2,000), a two-sided α=0.05, and 80% power, the Minimum Detectable Effect (MDE) for arm-to-arm comparisons is approximately 7–9 percentage points, depending on the true baseline reply rate. For example, if the baseline reply rate is 20%, the MDE is ~7.1pp; if 30%, ~8.1pp; if 40%, ~8.7pp; if 50%, ~8.9pp. These minimum detectable effects are appropriate for a pilot; the full-scale experiment will be powered to detect smaller effects once expanded. See the pre-analysis plan for power calculation details.
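The MDE figures above can be reproduced with the standard normal approximation for a two-sample comparison of proportions. This is a minimal sketch under that assumption; the pre-analysis plan may use a different (e.g., exact or simulation-based) procedure.

```python
# Sketch of the MDE calculation (normal approximation, two-sample
# proportions test, equal arm sizes): MDE = (z_{1-a/2} + z_{power}) *
# sqrt(2 * p * (1 - p) / n).
from math import sqrt
from statistics import NormalDist

def mde(p, n_per_arm, alpha=0.05, power=0.80):
    """Minimum detectable difference in reply rates for an arm-to-arm
    comparison at baseline rate p with n_per_arm centers per arm."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * sqrt(2 * p * (1 - p) / n_per_arm)

for p in (0.20, 0.30, 0.40, 0.50):
    print(f"baseline {p:.0%}: MDE = {mde(p, 500):.1%}")
# prints 7.1%, 8.1%, 8.7%, 8.9% -- matching the figures quoted above
```

The approximation uses the baseline rate p for both arms' variance terms; more conservative variants plug in p + MDE/2, which changes the result only slightly at these sample sizes.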
Intervention (Hidden)
Before: The main intervention is whether US daycare centers respond differently to inquiries from fictitious parents depending on (a) how the message frames the childcare request (unexpected relocation vs. “we dropped the ball”) and (b) whether the sender is a mother or a father. This is a 2×2 between-subject factorial design. In addition, at the daycare-center level, 50% of centers are randomly assigned to receive a follow-up email 5 days later (if no reply has been received). This serves as a randomized encouragement to respond and is independent of the content of the original message. We can use this as an instrument for selection into receiving any response from a center. Two weeks after the original email is sent, we will send an email informing the center that alternative childcare arrangements have been made and we no longer need their assistance. All details of email wording, sender identities, domains, and procedures are documented in the protocol and Pre-Analysis Plan (PAP).
After: The main intervention is whether US daycare and eldercare centers respond differently to inquiries from fictitious parents or children depending on (a) how the message frames the request (unexpected daycare closing or caregiver moving vs. “we dropped the ball” vs. no framing); (b) whether the sender is male or female; and (c) whether the sender is a single parent or not, in the case of childcare. This is a 3×2×3 between-subject factorial design. In addition, at the daycare- and eldercare-center level, the time and day of the week that each email is sent are randomized. We can use this time-and-day variation as an instrument for selection into receiving any response from a center. About 23.5 hours after any response from a childcare or eldercare center is received, we will send a response email informing the center that alternative arrangements have been made and we no longer need their assistance. For any facility that has not responded within two weeks of receiving the initial email, we will send an email informing them that “We have made other arrangements.”
Secondary Outcomes (End Points)
Before:
· ReplyLatency: Continuous measure of time to reply (in hours or days) from when the email was sent. This is the primary definition for latency and will be used in all main analyses.
· Tone/sentiment of response.
· Length of reply (word/character count).
· Content: Measures based on the content of the response (e.g., offer to schedule a tour/appointment, provision of alternative options or helpful resources, etc.).
Note on robustness: While the primary definition of ReplyLatency is continuous, for robustness and interpretability we may also categorize response time into bins (e.g., within 1 day, within 7 days, or before/after the 5-day encouragement follow-up).
After:
· ReplyLatency: Continuous measure of time to reply (in hours or days) from when the email was sent. This is the primary definition for latency and will be used in all main analyses.
· Tone/sentiment of response.
· Length of reply (word/character count).
· Content: Measures based on the content of the response (e.g., offer to schedule a tour/appointment, provision of alternative options or helpful resources, etc.).
Note on robustness: While the primary definition of ReplyLatency is continuous, for robustness and interpretability we may also categorize response time into bins (e.g., within 1 day, within 7 days).

IRBs

Field Before After
IRB Name Tufts SBER IRB
IRB Approval Date February 19, 2026
IRB Approval Number STUDY00006726