Identifying The Spillover and Congestion Effects of Large Scale Information Interventions

Last registered on August 18, 2020

Pre-Trial

Trial Information

General Information

Title
Identifying The Spillover and Congestion Effects of Large Scale Information Interventions
RCT ID
AEARCTR-0006302
Initial registration date
August 17, 2020

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published
August 18, 2020, 10:59 AM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Locations

Region

Primary Investigator

Affiliation
Princeton University

Other Primary Investigator(s)

PI Affiliation
Pontificia Universidad Catolica de Chile and J-PAL
PI Affiliation
University of Chicago - Becker Friedman Institute

Additional Trial Information

Status
Ongoing
Start date
2020-08-05
End date
2021-03-31
Secondary IDs
Abstract
The main objective of this project is to study how the effects of information provision vary with the scale of the intervention. We evaluate the effects on consumer demand of an intervention that provides information about product availability and product characteristics. We hypothesize that the changes in consumer demand can vary due to information spillovers when many consumers in the same market are exposed to the information treatment. Heterogeneous effects across types of consumers could increase or decrease differences in the allocation of products across consumers. On the other hand, if many consumers are better informed about their options and behave in a similar way, this may create congestion in the short run, leading some consumers to be unable to get what they wanted, even if supply can adjust in the medium run to accommodate the change in demand. We study these issues in the context of school choice, where consumers are families that submit a rank-ordered list to a centralized assignment mechanism and are later assigned to schools depending on a specific set of rules, capacities, and overall demand. This setting allows for a design in which we can, in theory, disentangle the spillover effects of the information intervention on choices from the congestion effects that occur when aggregate demand changes prior to any supply-side adjustments. We provide individual families as well as entire neighborhood markets with information about nearby schools. We observe their applications as well as their later assignments and matriculation choices, and we can test whether the intervention changed their applications as well as their later assignments. The eligible population for this study consists of all parents from dense urban areas (i.e., those with many schooling options) who participate in the centralized school choice process for the 2021 academic year in Chile. Since our implementing partner is the school choice administration itself, we are able to include almost the entire eligible population in our study. Our goal is to help determine whether information provision at scale can improve match quality and reduce segregation, while also taking into account that a scaled-up intervention could have adverse effects on some applicants due to increased congestion in the short run.
External Link(s)

Registration Citation

Citation
Allende Santa Cruz, Claudia, Francisco Gallego and Christopher Neilson. 2020. "Identifying The Spillover and Congestion Effects of Large Scale Information Interventions." AEA RCT Registry. August 18. https://doi.org/10.1257/rct.6302-1.0
Sponsors & Partners

There is information in this trial unavailable to the public.
Experimental Details

Interventions

Intervention(s)
Control: Basic report card
Parents in the control group will receive a report card with information on the results of the 2019 admission process for schools in their neighborhood (i.e., schools located within 2 km of the reported home address), in addition to detailed information about the schools that parents considered in their application portfolio (the POTENTIAL portfolio for those recruited before the application period opened, or the ACTUAL portfolio for those recruited after they submitted an official application).

Treatment: Basic report card + Detailed report card on schools within 2 km + Video
Parents in the treatment group will receive the same report card as the control group, plus a second report card that shows (1) a map with all the schools within 2 km of the reported household address and (2) detailed information about all the schools within 2 km of the household. This means that treated parents will most likely receive information about schools that they had NOT considered in their application portfolio, thereby expanding their consideration set. This group will also receive a video that explains the importance of being informed when applying to schools and the returns to a good education early in a student's life.
Intervention Start Date
2020-08-11
Intervention End Date
2020-09-08

Primary Outcomes

Primary Outcomes (end points)
At the individual level:
Mean test scores of schools in application (Primary application outcome)
Test scores of the school student attends in March 2021 (Primary placement outcome)
At the cluster level:
Probability of a poor (SEP) student attending a school in the higher tertile of quality within the cluster in March 2021 (Primary distributional outcome)
Primary Outcomes (explanation)
Mean test scores of schools in application: the average score of the schools listed, where each school's score is the average of the Spanish and math SIMCE (standardized test) scores from the most recent available year.
Test scores of the school that the student attends in March 2021: the school's score is the average of the Spanish and math SIMCE scores from the most recent available year.
Probability of a poor (SEP) student attending a school in the higher tertile of quality within the cluster: quality is a value-added measure, estimated using a rich set of controls, following the methodology in Neilson 2013/Neilson 2020.
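To make the construction of the first primary outcome concrete, the sketch below computes it from two hypothetical tables, one with school-level SIMCE scores and one with the schools listed in each application; the column names and example values are placeholders for illustration, not the actual administrative schema.

```python
import pandas as pd

# Hypothetical inputs: `schools` has one row per school with the most recent
# SIMCE math and Spanish scores; `applications` has one row per (student, school)
# pair listed in the submitted portfolio. Column names are illustrative only.
schools = pd.DataFrame({
    "school_id": [1, 2, 3],
    "simce_math": [260.0, 245.0, 270.0],
    "simce_spanish": [255.0, 240.0, 265.0],
})
applications = pd.DataFrame({
    "student_id": [10, 10, 11],
    "school_id": [1, 2, 3],
})

# School-level score: average of Spanish and math SIMCE from the most recent year.
schools["school_score"] = schools[["simce_math", "simce_spanish"]].mean(axis=1)

# Primary application outcome: mean score of the schools listed by each applicant.
outcome = (
    applications.merge(schools[["school_id", "school_score"]], on="school_id")
    .groupby("student_id")["school_score"]
    .mean()
    .rename("mean_score_of_listed_schools")
)
print(outcome)
```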

Secondary Outcomes

Secondary Outcomes (end points)
Mean value added of schools in application [secondary outcome]
Mean price of schools in application [secondary outcome]
Mean distance to schools in application [secondary outcome]
Number of schools listed in application [secondary outcome]
Change in application (applicable to parents recruited directly through the SAE, since they have already submitted an application at the moment we treat them) [secondary outcome]


Value added of the school they attend in March 2021 [secondary outcome]
Price of the school they attend in March 2021 [secondary outcome]
Distance to school they attend in March 2021 [secondary outcome]

Average number of applications received by schools that belong to the higher within-cluster tertile of quality (school-level outcome) [secondary outcome]
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The eligible population for our study consists of the parents of students applying to school in an entry grade (Pre-K, Kindergarten, First grade, Ninth grade) in Chile's dense urban areas for the academic year beginning in March 2021. As mentioned before, we will be able to include nearly the entire eligible population in our sample because we are working directly with the school choice agency. Due to political restrictions, it is not possible to include only the grades mentioned above, so the implementation will consider applicants from any grade (Pre-K through 12); however, the analysis will only consider those from entry grades. As expected, the majority of applicants belong to those entry grades. See the Pre-Analysis Plan for more details.

From the beginning of August 2020, the school choice portal will display a banner that reads “access personalized information here”. Parents who click on the banner will land on the page dedicated to this policy pilot (tuinformacion.mineduc.cl), where they are prompted to sign up. This constitutes the first channel of recruitment. By signing up, they become part of the RCT sample, provided they live inside our selected clusters. During sign-up, we request contact information (email, phone number for SMS and WhatsApp) and, importantly for the randomization, their home address. Once they are logged in, they are asked to enter the list of all the schools they know and how they would rank them in their actual application. All of those recruited in this way constitute sample 1, and they will be randomized into treatment by clusters.

Additionally, anyone who completes a real application will also be part of the experiment, provided they live inside our selected clusters. The difference is that these applicants enter the experiment after they have already submitted a real application. They will be randomized into treatment via clusters, and we refer to this sample as sample 2.

Additionally, on day three of the application process, we will take all the students from sample 2 that fall inside the control clusters and randomly select a subsample that will be randomized into treatment at the individual level. We refer to this subsample as sample 3. In order to keep the control clusters relatively clean, this sample cannot be very large. In practice, this means that there are few “slots” for sample 3, so we will prioritize students that do not have a particularly high switching cost. To select this sample, we proceed as follows. First, we restrict the universe of applicants to those whose decision process is especially interesting for the types of questions our study seeks to answer: applicants choosing schools for the first time (i.e., without a secured spot in their current school), with no siblings enrolled in any school, and with at least 5 schools within a 2 km radius. Then, we assign a random number to each individual that meets these criteria and rank them based on that number. Next, we select individuals for sample 3 by adding observations one by one, following the rank; an observation is added only if it is at least 0.5 km from every individual already in the sample.
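A minimal sketch of this greedy selection procedure is given below, assuming eligible applicants are available as (id, latitude, longitude) records; the function names, the haversine distance helper, and the seed are illustrative choices, not the study's actual code.

```python
import math
import random

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def select_sample_3(eligible, min_separation_km=0.5, seed=0):
    """Rank eligible applicants randomly, then add them one by one, keeping
    only those at least `min_separation_km` from everyone already selected."""
    rng = random.Random(seed)
    ranked = sorted(eligible, key=lambda _: rng.random())
    selected = []
    for person in ranked:
        if all(haversine_km(person["lat"], person["lon"], s["lat"], s["lon"]) >= min_separation_km
               for s in selected):
            selected.append(person)
    return selected

# Example usage with hypothetical applicants located in control clusters:
eligible = [
    {"id": 1, "lat": -33.450, "lon": -70.660},
    {"id": 2, "lat": -33.451, "lon": -70.661},  # roughly 150 m from applicant 1
    {"id": 3, "lat": -33.470, "lon": -70.640},
]
print([p["id"] for p in select_sample_3(eligible)])
```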

In summary, the unit of randomization is the cluster for samples 1 and 2 and the parent for sample 3. The clusters were constructed using the geographic distribution of applications in 2019, employing the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm followed by a k-means partition of the dense areas (see Experimental Design Details).
Experimental Design Details

Last year (2019), there were 483,814 applicants. Out of these, 429,476 belonged to urban areas, and almost all of these belonged specifically to the clusters we are considering for our experiment. We expect the implementation this year to happen with a similar number of participants (divided into 3 samples, as explained before).

The unit of randomization is the cluster for samples 1 and 2 and the parent for sample 3. Since the clusters are the innovative part of our design, we delve into their details here.

The clusters are constructed using the geographic distribution of applications in the previous year. This is done market by market, in two steps. In the first step, we use the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm to classify locations into those that are closely packed together (points with many nearby neighbors) and the outlier points that lie alone in low-density regions (whose nearest neighbors are too far away). We then partition the N observations in the first group into K clusters using a k-means algorithm, which minimizes the within-cluster variance.
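This two-step construction could be sketched as follows with scikit-learn; the eps, min_samples, and K values, as well as the simulated coordinates, are placeholders for illustration rather than the parameters used in the study.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

# Hypothetical applicant home locations for one market, in projected km
# coordinates (e.g., local UTM divided by 1000). Values are illustrative.
rng = np.random.default_rng(0)
coords_km = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(200, 2)),   # one dense area
    rng.normal(loc=(5, 5), scale=0.5, size=(200, 2)),   # a second dense area
    rng.uniform(low=-10, high=15, size=(20, 2)),        # scattered outliers
])

# Step 1: DBSCAN separates densely packed points from low-density outliers
# (labeled -1), which are dropped before partitioning.
db = DBSCAN(eps=0.8, min_samples=10).fit(coords_km)
dense_points = coords_km[db.labels_ != -1]

# Step 2: k-means partitions the dense points into K clusters, minimizing
# within-cluster variance. K here is a placeholder.
K = 4
km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(dense_points)
print(np.bincount(km.labels_))  # number of applicants per cluster
```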

Then, clusters are divided into three areas. First, the core of the cluster (see white areas in Figure 5 of the Appendix). This will be our unit of observation for the simulated effects of the policy on school congestion and school quality. Then, we have an intermediate zone. This will be our unit for policy implementation. In treated clusters, all the nodes that fall inside these intermediate zones will be assigned to the treatment for simulations. Finally, we define a buffer zone that goes around the intermediate zone. This (buffer) area between treated and control neighborhoods is needed to have a clean experiment. The smaller the buffer, the more likely it is that there can be spillovers between treatment and control clusters.
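One possible way to implement the core / intermediate / buffer partition is by planar buffering of each cluster's footprint, as in the shapely sketch below; the cluster polygon and the two buffer widths are hypothetical and chosen purely for illustration.

```python
from shapely.geometry import Polygon

# Hypothetical cluster footprint in projected km coordinates (illustrative only).
cluster = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])

core_width_km = 0.5     # assumed width eroded from the edge to define the core
buffer_width_km = 0.5   # assumed width of the outer buffer ring

# Core: unit of observation for the simulated congestion and quality effects.
core = cluster.buffer(-core_width_km)

# Intermediate zone: inside the cluster but outside the core; unit of implementation.
intermediate = cluster.difference(core)

# Buffer: ring around the cluster separating treated and control neighborhoods.
buffer_ring = cluster.buffer(buffer_width_km).difference(cluster)

print(core.area, intermediate.area, buffer_ring.area)
```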

For samples 1 and 2, randomization will occur at the cluster level defined above. Additionally, parents from sample 2 that were assigned to control clusters and that meet certain criteria (sample 3) will go through a second process of randomization at the individual level. The criteria that must be met in order to qualify for sample 3 are: to be currently applying to Pre-K, Kindergarten, or 1st grade; to not have older siblings enrolled in a school; and to have at least 5 schools within a 2 km radius. In order to avoid spillovers between treatment and control individuals, we will block a circular area of 0.5 km radius around each individual in this sample.
Randomization Method
Randomization is done in an office by a computer.
Randomization Unit
For samples 1 and 2: the cluster
For sample 3: the parent
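As an illustration of this computer-based assignment, the sketch below randomizes 106 of the 212 clusters to treatment (samples 1 and 2) and assigns sample 3 parents individually; the seed, the placeholder IDs, and the assumed 50/50 split for sample 3 are ours, not details from the registration.

```python
import numpy as np

rng = np.random.default_rng(seed=2020)  # seed value is illustrative

# Samples 1 and 2: assign 106 of the 212 clusters to treatment.
cluster_ids = np.arange(212)
treated = set(rng.choice(cluster_ids, size=106, replace=False).tolist())
cluster_assignment = {int(c): int(c in treated) for c in cluster_ids}

# Sample 3: individual-level assignment among the selected parents in control
# clusters. The 50/50 split and the placeholder IDs are assumptions.
sample3_parent_ids = list(range(2000))
flags = rng.permutation([1] * 1000 + [0] * 1000)
parent_assignment = dict(zip(sample3_parent_ids, (int(f) for f in flags)))

print(sum(cluster_assignment.values()), sum(parent_assignment.values()))
```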
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
212 clusters (only relevant for samples 1 and 2. Sample 3 is randomized into treatment individually).
Sample size: planned number of observations
around 400,000 parents
Sample size (or number of clusters) by treatment arms
Sample 1 - Treatment: 3,000 parents
Sample 1 - Control : 3,000 parents
Sample 2 - Treatment: 200,000 parents
Sample 2 - Control: 200,000 parents
Sample 3 is a subsample of Sample 2 and has to be selected during the implementation to meet the restrictions we imposed in the design. We estimate it will consist of about 2,000 parents.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
We include here power calculations for our three primary outcomes; calculations for a much larger number of outcomes can be found in the Pre-Analysis Plan. Baseline levels and estimated numbers of observations are based on data from the 2019 application process. For all calculations, alpha was set at the standard 0.05 level and power at 0.8. As mentioned before, the applicants that enter any regression are those applying to an entry grade (Pre-K, Kindergarten, First grade, Ninth grade, and to some extent, Seventh grade), so the total N in the calculations considers only the applicants from entry grades in 2019. In the case of samples 1 and 2, where randomization is done by clusters, applicants must also live in the core or inner part of the clusters to enter the regressions; buffer regions are not part of the analysis. For all samples and for each outcome, the baseline mean is the mean among all 2019 applicants who fall inside the core and inner section of control clusters.

We have 212 clusters: 106 assigned to treatment and 106 assigned to control. We expect take-up of the treatment to be 80% in the treatment group and generally low in the control group (set at 10% for the calculations), and we adjust our MDEs for compliance by multiplying by 1/(c - s) = 1/(0.8 - 0.1) ≈ 1.43. Since we are able to follow every student in Chile with administrative data, we expect minimal attrition; in our power calculations attrition is set at 5%. Because treatment is assigned by cluster for samples 1 and 2, we account for clustering in the corresponding calculations and explicitly state our assumptions about intracluster correlation for each outcome. We used an ICC value of 0.1, but sensitivity analyses show that even with an ICC of up to 0.5 the MDE remains below 0.3. To estimate the size of sample 3, we take all applicants from 2019 who fall inside control clusters and count how many meet the conditions (no older sibling, no continuation, 5+ schools around); given the size of the clusters and the requirement that households in sample 3 be 500 m apart from each other, we roughly estimate an N of around 2,000 for this sample.

Sample 1:
Mean test scores of schools in application: Control mean: 254.83, MDE: 0.17 (< .001%)
Test scores of school where they end up enrolled: Control mean: 249.34, MDE: 0.17 (< .001%)
Probability of poor (SEP) student attending a school in the higher tertile of quality in the cluster in March 2021: Control mean: , MDE:

Sample 2:
Mean test scores of schools in application: Control mean: 254.83, MDE: 0.15 (< .001%)
Test scores of school where they end up enrolled: Control mean: 249.34, MDE: 0.15 (< .001%)
Probability of poor (SEP) student attending a school in the higher tertile of quality in the cluster in March 2021: Control mean: , MDE:

Sample 3:
Mean test scores of schools in application: Control mean: 254.83, MDE: 0.14 (< .001%)
Test scores of school where they end up enrolled: Control mean: 249.34, MDE: 0.14 (< .001%)
Probability of poor (SEP) student attending a school in the higher tertile of quality in the cluster in March 2021: Control mean: , MDE:
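The clustered MDE logic described above can be reproduced with the standard normal-approximation formula, as in the sketch below, which combines a design effect for intracluster correlation, an attrition adjustment, and the 1/(c - s) compliance adjustment; the function and the inputs in the example call are illustrative rather than the exact values behind the registered numbers.

```python
from scipy.stats import norm

def mde_sd_units(n_total, alpha=0.05, power=0.8, treat_share=0.5,
                 icc=0.1, avg_cluster_size=1.0, attrition=0.05,
                 compliance_treat=0.8, compliance_control=0.1):
    """Minimum detectable effect in standard-deviation units for a clustered RCT,
    using a normal-approximation formula with a design effect for intracluster
    correlation and an adjustment for imperfect compliance."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    n_eff = n_total * (1 - attrition)                     # attrition adjustment
    design_effect = 1 + (avg_cluster_size - 1) * icc      # clustering adjustment
    se_factor = (design_effect / (treat_share * (1 - treat_share) * n_eff)) ** 0.5
    return z * se_factor / (compliance_treat - compliance_control)

# Illustrative call: roughly 400,000 applicants split over 212 clusters, so an
# average cluster size of about 1,900; these are rough placeholders.
print(round(mde_sd_units(n_total=400_000, avg_cluster_size=1_900), 3))
```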
IRB

Institutional Review Boards (IRBs)

IRB Name
Institutional Review Board for Human Subjects - Research Integrity and Assurance, Princeton University
IRB Approval Date
2019-09-03
IRB Approval Number
11970
Analysis Plan

There is information in this trial unavailable to the public.

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials