Experimental Design
To be eligible for the study, applicants must be Indiana residents, must not have previously earned a high school diploma, must not be sex offenders, and must be at least 18 years of age. With the exception of the age restriction, these eligibility criteria exactly follow those of TEC.
Early evidence from a pilot study of these interventions indicates that English language learners may react differently to the behavioral interventions than their peers: life coaches have stated that this group is largely seeking English classes rather than a high school diploma. The intervention may therefore be a poor fit for these students, since they do not face the barriers that the intervention seeks to address. The research team thus plans to (1) study English language learners as a sub-population, and (2) decide with the GW team, based on data from the initial pilot, whether to include this population in the roll-out of the intervention to all ten schools.
The study process begins with potential students filling out an online application form. As part of this form, they are presented with a shortened consent form and given the option of opting out of research participation. Eligible applicants are then randomized, with probability ¼ each, into one of four groups: a pure control group that receives standard services (i.e., a group orientation and no video message), a group that receives only the video intervention, a group that receives only the one-on-one orientation, and a group that receives both the video and the one-on-one orientation interventions. Additionally, we will randomize all past and existing students of TEC to one of these four groups at the start of the study; this randomization will be stratified by whether their primary language is English and by whether they previously enrolled in, rather than only applied for, a term at TEC. If these students return to TEC after time away, they will be offered the take-up interventions based on this initial randomization.
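The stratified randomization of past and existing students can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function name `assign_stratified`, the arm labels, the toy roster, and the choice to balance arms within each stratum by shuffling and rotating through the arms are all assumptions made for the example.

```python
import random
from collections import defaultdict

# The four study arms: (label, video offered, one-on-one orientation offered).
# Labels are hypothetical names chosen for this sketch.
ARMS = [
    ("control", 0, 0),
    ("video_only", 1, 0),
    ("orientation_only", 0, 1),
    ("video_and_orientation", 1, 1),
]

def assign_stratified(people, stratum_of, seed=0):
    """Assign each person to one of the four arms, balancing within strata.

    people: list of ids; stratum_of: function mapping an id to a stratum key.
    Returns a dict {id: arm label}.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for person in people:
        by_stratum[stratum_of(person)].append(person)

    assignment = {}
    for key in sorted(by_stratum):
        members = by_stratum[key]
        rng.shuffle(members)  # random order within the stratum
        # Random starting arm, so leftover slots do not always go to the same arms.
        offset = rng.randrange(len(ARMS))
        for idx, person in enumerate(members):
            assignment[person] = ARMS[(idx + offset) % len(ARMS)][0]
    return assignment

# Toy roster (not study data): id -> (primary language is English, previously enrolled).
roster = {i: (i % 2 == 0, i % 3 == 0) for i in range(40)}
arms = assign_stratified(list(roster), roster.get, seed=42)
```

Within each of the four strata, arm sizes differ by at most one person. New applicants, who arrive one at a time, would instead need a per-applicant draw with probability ¼ for each arm.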
We expect nearly full compliance for the video intervention, as it will be texted to the cell phone number provided in the application. However, there may not be full compliance for the one-on-one orientation: some individuals may repeatedly miss their scheduled one-on-one orientation dates, in which case they will be moved back to the group orientation.
For the analysis, our primary specification estimates the impact of the offer of each intervention, or intention-to-treat (ITT) effects, on outcomes. Since the offer is randomly assigned, any differences in outcomes between groups can be attributed to the intervention offer. The basic specification is (we encase subscripts in parentheses for clarity):
y(ist) = β(1)*Video(it) + β(2)*Orientation(it) + τ(st) + γX(it) + ε(ist)
where y(ist) is one of our primary take-up outcomes (e.g., TEC enrollment or graduation) for applicant i in application term t, and s denotes the applicant's randomization stratum. Video(it) indicates whether participant i was sent the video intervention by term t. Similarly, Orientation(it) indicates whether participant i had been assigned to the one-on-one orientation by term t. τ(st) are strata-time fixed effects, which will include a set of application term indicators and the two indicators used in the stratified randomization of prior applicants. Lastly, X(it) is a vector of control variables, such as demographics (age, race, gender) and pre-study variables captured in the application form or historical program data. The coefficients of interest, β(1) and β(2), respectively estimate the average difference in outcomes between each intervention group (video or one-on-one orientation) and the control group, controlling for baseline characteristics.
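To make the specification concrete, the sketch below estimates β(1) and β(2) by ordinary least squares with stratum-term fixed effects. It absorbs the fixed effects by demeaning every variable within each stratum-term cell and then solves the two-regressor normal equations. It is an illustration under simplifying assumptions: the control vector X(it) is omitted, no standard errors are computed, the helper names `demean_within` and `itt_estimates` are hypothetical, and the data are a noise-free toy example with a continuous outcome rather than study data.

```python
from collections import defaultdict

def demean_within(values, cells):
    """Subtract each cell's mean from its members (the within transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, c in zip(values, cells):
        sums[c] += v
        counts[c] += 1
    return [v - sums[c] / counts[c] for v, c in zip(values, cells)]

def itt_estimates(y, video, orientation, cells):
    """OLS of y on Video and Orientation with cell fixed effects.

    Returns (beta1, beta2), the coefficients on the two offer indicators.
    """
    yd = demean_within(y, cells)
    vd = demean_within(video, cells)
    od = demean_within(orientation, cells)
    # Cross-products for the 2x2 normal equations.
    svv = sum(v * v for v in vd)
    soo = sum(o * o for o in od)
    svo = sum(v * o for v, o in zip(vd, od))
    svy = sum(v * t for v, t in zip(vd, yd))
    soy = sum(o * t for o, t in zip(od, yd))
    det = svv * soo - svo * svo
    if det == 0:
        raise ValueError("Video and Orientation are collinear within cells")
    beta1 = (soo * svy - svo * soy) / det
    beta2 = (svv * soy - svo * svy) / det
    return beta1, beta2

# Toy data: three stratum-term cells, each containing all four arms.
cells_list = [("term_A", 0), ("term_A", 1), ("term_B", 0)]
cell_effect = {cells_list[0]: 0.2, cells_list[1]: 0.4, cells_list[2]: 0.3}
rows = [(c, v, o) for c in cells_list for v in (0, 1) for o in (0, 1) for _ in range(5)]
cells = [c for c, _, _ in rows]
video = [v for _, v, _ in rows]
orientation = [o for _, _, o in rows]
# Outcome built with a video effect of 0.10 and an orientation effect of 0.05.
y = [cell_effect[c] + 0.10 * v + 0.05 * o for c, v, o in rows]

b1, b2 = itt_estimates(y, video, orientation, cells)
```

Because the toy outcome contains no noise, the estimates recover the effects used to build it, up to floating-point error. In practice a regression package that also reports standard errors would be the natural choice.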
Using historical (out-of-sample) data, we will predict a given applicant's likelihood of enrolling in, persisting in, and graduating from the program. We anticipate that students who enroll in remedial classes and students who are first-time applicants will be less likely to enroll, persist, and graduate from TEC. We will explore heterogeneity by predicted persistence as well as by key predictors of persistence (e.g., enrollment in remedial classes, English Language Learner status) to understand how the program interacts with the barriers applicants face at baseline. Does the program nudge those on the margin of persisting at baseline? Does it facilitate persistence among those least likely to persist at baseline?
Given that our main outcomes relate to take-up, we will also report differences in average characteristics of those who enrolled, persisted, or graduated across treatment and control groups using a LATE framework (Angrist, Imbens, and Rubin, 1996). This approach instruments for program enrollment, persistence (e.g., any receipt of credit), or program graduation with treatment status, and the dependent variable is a characteristic. Characteristics may include race, age, gender, previous applicant status, predicted academic level upon entry, or English Language Learner status. We will pre-specify the characteristics to be included based on pilot data and will adjust for multiple hypothesis testing using randomization inference. This will provide rich information on whether the interventions differentially benefit those who face the most significant barriers.
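The text above does not specify the exact multiple-testing procedure. One common option is a single-step max-statistic adjustment computed by re-randomizing treatment labels, sketched below. Several simplifications are assumptions of this example and not part of the plan: the test statistic is a standardized difference in means rather than the instrumental-variables estimate, labels are permuted across the whole sample rather than within strata, only two arms are compared, and the function names and toy data are hypothetical.

```python
import random
from statistics import mean, pstdev

def diff_in_means(values, treated):
    """Mean among treated units minus mean among control units."""
    t = [v for v, d in zip(values, treated) if d]
    c = [v for v, d in zip(values, treated) if not d]
    return mean(t) - mean(c)

def maxt_adjusted_pvalues(characteristics, treated, n_perm=2000, seed=0):
    """Single-step max-statistic adjusted p-values via re-randomization.

    characteristics: {name: list of values}, one value per unit.
    treated: list of 0/1 assignment flags, same length as each value list.
    """
    rng = random.Random(seed)
    # Put characteristics on a common scale; the full-sample SD does not
    # change when labels are permuted, so it is computed once.
    scaled = {}
    for name, vals in characteristics.items():
        sd = pstdev(vals)
        if sd == 0:
            raise ValueError(f"{name} has no variation")
        scaled[name] = [v / sd for v in vals]

    observed = {k: abs(diff_in_means(v, treated)) for k, v in scaled.items()}
    exceed = {k: 0 for k in scaled}
    labels = list(treated)
    for _ in range(n_perm):
        rng.shuffle(labels)  # one draw from the re-randomization distribution
        max_stat = max(abs(diff_in_means(v, labels)) for v in scaled.values())
        for k in scaled:
            if max_stat >= observed[k]:
                exceed[k] += 1
    # Add one to numerator and denominator so p-values are never exactly zero.
    return {k: (exceed[k] + 1) / (n_perm + 1) for k in scaled}

# Toy data (not study data): age differs sharply by arm, gender is balanced.
treated = [1] * 20 + [0] * 20
toy = {
    "age": [30 + i % 3 for i in range(20)] + [20 + i % 3 for i in range(20)],
    "female": [i % 2 for i in range(40)],
}
p_adj = maxt_adjusted_pvalues(toy, treated, n_perm=500, seed=1)
```

Comparing each observed statistic against the permutation distribution of the maximum across all characteristics controls the family-wise error rate without assuming independence across characteristics.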
We will also explore heterogeneity by remedial class status and predicted program completion. GCSC has indicated that students who enroll in remedial courses, particularly for English and math, are at a higher risk of exiting from TEC due to the longer time to completion for diplomas. Therefore, strong treatment effects from the one-on-one intervention may be indicative of its ability to reframe the importance of a high school diploma for this population. Additionally, if English Language Learners are included in the main analysis, we will explore effects for this sub-sample of students, as they face unique barriers relative to the full sample.
Finally, as part of our analysis, we would like to explore treatment effect heterogeneity using machine learning methods, such as LASSO (Chernozhukov et al., 2018; Davis & Heller, 2017). These will allow us to ascertain which sub-populations in the study were most affected by the treatment. The benefit of this approach is that it does not require the researchers to know a priori which characteristics or interactions of characteristics are related to underlying heterogeneity; the algorithm identifies these characteristics while penalizing over-fitting.
Citations:
Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434), 444–455. https://www.jstor.org/stable/2291629
Chernozhukov, V., Demirer, M., Duflo, E., & Fernandez-Val, I. (2018). Generic machine learning inference on heterogeneous treatment effects in randomized experiments, with an application to immunization in India (Working Paper No. w24678). National Bureau of Economic Research. https://www.nber.org/system/files/working_papers/w24678/w24678.pdf
Davis, J., & Heller, S. B. (2017). Using causal forests to predict treatment heterogeneity: An application to summer jobs. American Economic Review, 107(5), 546–550. https://www.jstor.org/stable/44250458