Experimental Design
As described above, we conducted a baseline survey during May-July 2023 in five districts (Kalahandi, Mayurbhanj, Bolangir, Rayagada, and Ganjam) of Odisha state, India. From each selected GP within these districts (see section \ref{siteselection}), we selected 15 households at random from among all registered MGNREGA job card holders. In each household, our primary respondent was the woman who had worked the highest number of days on the program over the five-year period preceding the survey. If there was more than one such woman, we selected the youngest. If no woman in the household had worked on the program over the five-year period, we selected the wife of the man who had worked the largest number of days over the same period. Our target sample was 3,750 women from 250 GPs. At baseline, we were able to interview 3,426 women from 230 GPs; 20 GPs in Ganjam district refused to allow the survey to be conducted.
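The respondent-selection rule above can be sketched in code. This is a minimal illustration, not our survey firm's actual procedure: the record layout (the keys `id`, `sex`, `age`, `days_worked`, and `wife_id`) is hypothetical, and the handling of cases the rule does not cover (for example, a household with no eligible man) is an assumption.

```python
def select_primary_respondent(members):
    """Return the id of the primary respondent, or None if the rule yields no one.

    Each member is a dict with hypothetical keys: "id", "sex" ("F" or "M"),
    "age", "days_worked" (MGNREGA days over the five-year window) and,
    for men, an optional "wife_id".
    """
    women_who_worked = [
        m for m in members if m["sex"] == "F" and m["days_worked"] > 0
    ]
    if women_who_worked:
        most_days = max(m["days_worked"] for m in women_who_worked)
        tied = [m for m in women_who_worked if m["days_worked"] == most_days]
        # Tie-break: the youngest woman among those with the most days.
        return min(tied, key=lambda m: m["age"])["id"]

    # Fallback: wife of the man with the most days worked.
    men = [m for m in members if m["sex"] == "M"]
    if not men:
        return None  # assumption: the text does not cover this case
    top_man = max(men, key=lambda m: m["days_worked"])
    return top_man.get("wife_id")


# Hypothetical household: two women tied on days, so the younger is chosen.
household = [
    {"id": "w1", "sex": "F", "age": 40, "days_worked": 120},
    {"id": "w2", "sex": "F", "age": 28, "days_worked": 120},
    {"id": "m1", "sex": "M", "age": 45, "days_worked": 200, "wife_id": "w1"},
]
print(select_primary_respondent(household))  # prints "w2"
```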
Based on our intervention budget, we selected 1,400 ``target'' women from the baseline sample with whom to conduct our interventions. We chose them as follows. First, we decided to discontinue the study in Ganjam district. Both the baseline survey and some additional qualitative data we collected suggest that the MGNREGA program operates in a top-down manner there, which means that our planned interventions are less likely to be relevant in this context. Our baseline survey also shows that sample households in Ganjam are significantly wealthier than households in other districts, a characteristic generally associated with lower demand for MGNREGA work and assets. Finally, the high refusal rate at baseline made it likely that our interventions would be harder to implement in Ganjam. The baseline sample in the remaining four districts (Bolangir, Kalahandi, Mayurbhanj, and Rayagada) was 2,982 women, spread across 200 GPs in 60 blocks (an administrative unit above the GP). Second, we reduced the sample by retaining only the 94 GPs in these four districts where travel times were lowest for our survey firm, with 14 additional GPs selected as back-ups. This substantially improved the cost-effectiveness of the project. We instructed our implementing partners to always target all 15 women in any GP they entered, and to stop once they reached the target of 1,400 women.
We randomized our three study conditions (placebo, T1, T2) at the target-woman level, stratifying on GP (there are generally 15 women in each GP, so roughly 5 women in each treatment arm). We used Stata's randtreat command and handled misfits globally. Of the 3-5 friends the target woman gathers, the first to arrive who is herself an MGNREGA job card holder was administered a short (about 7 minutes) pre-treatment questionnaire and, along with the target woman, was re-interviewed at endline. Thus, our design is effectively clustered at the group level, where a group comprises the target woman plus her interviewed friend; these groups also make up our sample for analysis.
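The logic of stratified assignment with globally handled misfits can be illustrated as follows. This is a rough Python analogue written for exposition, not a re-implementation of randtreat: the function name `assign_arms`, the input layout, and the seed are all hypothetical. Within each GP, women are allocated to arms in whole multiples of three; the leftover ``misfits'' are pooled across GPs and then randomized together.

```python
import random

ARMS = ("placebo", "T1", "T2")


def assign_arms(women_by_gp, seed=0):
    """Map each woman id to an arm, stratifying on GP.

    women_by_gp: dict of GP id -> list of unique woman ids (hypothetical layout).
    """
    rng = random.Random(seed)
    assignment = {}
    misfits = []
    for gp in sorted(women_by_gp):
        women = list(women_by_gp[gp])
        rng.shuffle(women)
        # Largest multiple of the number of arms that fits in this stratum.
        n_full = len(women) - len(women) % len(ARMS)
        for pos, woman in enumerate(women[:n_full]):
            assignment[woman] = ARMS[pos % len(ARMS)]
        misfits.extend(women[n_full:])

    # Misfits are pooled over all strata and randomized together.
    rng.shuffle(misfits)
    arm_order = list(ARMS)
    rng.shuffle(arm_order)
    for pos, woman in enumerate(misfits):
        assignment[woman] = arm_order[pos % len(arm_order)]
    return assignment


# Hypothetical example: three GPs with 15, 14, and 13 target women.
example = {
    "gp1": [f"gp1_{i}" for i in range(15)],
    "gp2": [f"gp2_{i}" for i in range(14)],
    "gp3": [f"gp3_{i}" for i in range(13)],
}
arms = assign_arms(example, seed=2023)
counts = {arm: sum(1 for a in arms.values() if a == arm) for arm in ARMS}
print(counts)
```

In a GP with exactly 15 target women this yields 5 women per arm; GPs with other counts contribute their remainders to the pooled draw, which keeps overall arm sizes within one woman of each other.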
We will analyze outcomes, $Y_{igt}$, using an ANCOVA specification, as follows:
\[ Y_{igt} = \beta_0 + \beta_1 T1_{ig} + \beta_2 T2_{ig} + \beta_3 Y_{ig,t-1} + \delta_g + F_{ig} + X_{igt} + \varepsilon_{ig} \]
where $i$ indexes women, $g$ indexes GPs, and $t$ indexes time ($t-1$ is pre-treatment, $t$ is endline); $T1$ is an indicator for assignment to treatment 1, $T2$ is an indicator for assignment to treatment 2, and $F$ is an indicator for being a friend of the target woman (as opposed to the target woman herself). $X_{igt}$ is a vector of control variables. As we stratify on GP, we include GP fixed effects ($\delta_g$). The effects of each treatment are relative to the base group (assignment to the placebo group, with only an information treatment).
Our main specification will include all women (target women plus friends and neighbors) and will cluster standard errors at the level of the target woman-friend pair, to account for shared unobservable characteristics. Secondary specifications will include only target women or only friends. We will also estimate a specification that interacts the treatment indicators with the indicator F to test whether we can reject the null hypothesis that the effects of our treatments on outcomes of interest are the same for target women and their friends/neighbors.
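For concreteness, the interacted specification could be written as follows. This is a sketch: the coefficient symbols $\gamma$, $\theta_1$, and $\theta_2$ are notation introduced here for illustration only, and all other terms follow the ANCOVA specification given earlier.

```latex
\begin{align*}
Y_{igt} ={}& \beta_0 + \beta_1 T1_{ig} + \beta_2 T2_{ig} + \beta_3 Y_{ig,t-1} + \gamma F_{ig} \\
& + \theta_1 \left(T1_{ig} \times F_{ig}\right) + \theta_2 \left(T2_{ig} \times F_{ig}\right)
  + \delta_g + X_{igt} + \varepsilon_{ig}
\end{align*}
```

Under this notation, $\beta_1$ and $\beta_2$ are the treatment effects for target women, $\beta_1 + \theta_1$ and $\beta_2 + \theta_2$ are the corresponding effects for friends/neighbors, and the null hypothesis of equal effects is $H_0\colon \theta_1 = \theta_2 = 0$, which could be assessed with a joint test.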
For some outcomes we lack baseline data for all women; for others, we have baseline data for target women but not for their friends/neighbors. Where we have a baseline value of an outcome, we will control for it; where we do not, we will omit this control.
In our primary specification, our vector of control variables $X_{igt}$ will include only strata fixed effects. However, we will also estimate specifications that include a vector of the pre-treatment values of several control variables (to increase precision). These include the respondent's age, marital status, a vector of occupation dummies, household head caste dummies, household head religion dummies, a dummy for someone in the household having migrated in the year prior to the baseline survey (i.e., between Holi 2022 and Holi 2023), and the number of acres of agricultural land owned.
We will estimate heterogeneous treatment effects along four dimensions. For all heterogeneous treatment effects, we test effects on the set of primary outcomes only (both aspirations and behaviors). First, we examine heterogeneity by whether or not a woman is a member of an SHG (question G.1). We expect women who are members of SHGs to have substantially more social capital than non-members, and further anticipate that this pre-existing social capital will be a complement to (rather than a substitute for) the information and skills provided by our treatments. Second, we examine heterogeneity by the baseline level of engagement with the MGNREGA program, measured by the number of days the respondent or others in her household worked on the program in the pre-baseline period. A priori, we might expect effects to be larger among those workers who were already actively involved in the MGNREGA prior to the intervention. Studying heterogeneous effects along this dimension allows us to differentiate between impacts on the intensive margin (among previously active program users) and on the extensive margin (expanding the set of those who actively engage with the program). Third, we plan to interact the treatment with whether or not a woman is aged 35 or below, to understand whether the treatments have differential impacts for younger and older women. Finally, if women at baseline believe that local institutions are dominated by elites, they may believe that participating in such a training can make little difference. We therefore estimate heterogeneous treatment effects using an indicator for whether women believe that their Gram Panchayat is dominated by a small group of elites (I.3C in the baseline survey). We set this dummy variable to one if a woman believes that elite domination characterizes her Gram Panchayat ``to a great extent''; all other values are coded as zero.
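The coding of these four moderators can be sketched as below. The field names (`shg_member`, `age`, `household_days_worked`, `elite_domination`) and the way responses are stored are hypothetical stand-ins for the baseline survey variables; only the cut-offs (age 35 or below, and the response ``to a great extent'') come from the text.

```python
def moderator_variables(row):
    """Build the heterogeneity moderators from one baseline record.

    `row` is a dict with hypothetical keys; the actual survey variable
    names and response codes (e.g. for G.1 and I.3C) may differ.
    """
    elite_response = row["elite_domination"].strip().lower()
    return {
        # 1 if the woman is an SHG member, else 0.
        "shg_member": int(bool(row["shg_member"])),
        # Baseline engagement: MGNREGA days worked, kept as a count.
        "baseline_days": row["household_days_worked"],
        # 1 if aged 35 or below, else 0.
        "age_35_or_below": int(row["age"] <= 35),
        # 1 only for "to a great extent"; all other values are 0.
        "elite_dominated": int(elite_response == "to a great extent"),
    }


# Hypothetical baseline record.
record = {
    "shg_member": True,
    "age": 35,
    "household_days_worked": 42,
    "elite_domination": "To a great extent",
}
print(moderator_variables(record))
```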
Our paper outlines five classes of outcomes: women's aspirations (primary); women's behaviors related to exercising voice and agency (primary); potential backfire effects of treatments (secondary); pathways related to information and skills (secondary); and pathways related to social support and gender norms (secondary). We will correct for multiple testing separately within each of these broad families of outcomes, controlling the false discovery rate following Anderson (2008).
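As an illustration of the within-family correction, the sketch below computes sharpened two-stage q-values (Benjamini, Krieger, and Yekutieli 2006), the procedure described in Anderson (2008). We assume here that this is the intended variant; the function name, the grid step of 0.001, and the p-values in the example are illustrative and are not results from our study.

```python
def sharpened_q_values(pvals):
    """Sharpened two-stage FDR q-values for one family of p-values.

    Returns q-values in the same order as `pvals`. For each candidate
    level q (from 1.000 down to 0.001), a hypothesis is marked with q if
    the two-stage procedure rejects it at that level, so the final value
    is the smallest grid level at which it is rejected.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    sorted_p = [pvals[i] for i in order]
    q_sorted = [1.0] * m

    def n_rejected(level):
        # Largest 1-based rank r with p_(r) <= level * r / m (0 if none).
        r_max = 0
        for r, p in enumerate(sorted_p, start=1):
            if p <= level * r / m:
                r_max = r
        return r_max

    for step in range(1000, 0, -1):
        q = step / 1000
        q_stage1 = q / (1 + q)
        r1 = n_rejected(q_stage1)
        if r1 == m:
            r2 = m  # everything rejected in stage one stays rejected
        else:
            q_stage2 = q_stage1 * m / (m - r1)
            r2 = n_rejected(q_stage2)
        for k in range(r2):
            q_sorted[k] = q

    q_values = [1.0] * m
    for rank, i in enumerate(order):
        q_values[i] = q_sorted[rank]
    return q_values


# Hypothetical p-values, corrected separately within each outcome family.
families = {
    "aspirations": [0.004, 0.030, 0.200],
    "behaviors": [0.012, 0.450],
}
for name, family_pvals in families.items():
    print(name, sharpened_q_values(family_pvals))
```

Because the correction is applied family by family, a q-value depends only on the other p-values in the same family, not on the total number of outcomes across all five families.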