Experimental Design
Evaluation questions
Our main questions are: (i) What are the overall impacts of UCTs on various dimensions of household welfare? (ii) How should UCTs be structured to maximize impact? In particular, how do outcomes vary with three key, yet understudied, design parameters: recipient gender, transfer frequency, and transfer size? The intervention therefore contains the following treatment arms:
1. Transfers to the woman vs. the man in the household. Half of the transfers were made to the woman, and the other half to the man. This feature allows us to identify the differential welfare effects of gender-specific cash transfers.
2. Lump-sum transfers vs. monthly installments. Half of the transfers were made as a lump sum, and the other half were paid in nine monthly installments. By randomizing the month in which the lump-sum transfer was made, we kept the discounted present value of the transfers similar across the lump-sum and monthly installment groups.
3. Large vs. small transfers. Finally, 28% of the transfers were large ($1,100), while the remainder were small ($300). This manipulation allows us to estimate the effect of transfer magnitude on welfare outcomes.
These three treatment arms are fully crossed with each other, with one exception: the $1,100 transfers were made to existing recipients of $300 transfers in the form of an $800 top-up, delivered as a stream of payments after respondents had already been told that they would receive $300 transfers. The pre-analysis plan outlines how this issue is dealt with in the analysis. A sketch of the resulting assignment procedure is shown below.
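The following minimal sketch illustrates how such a cross-randomization could be implemented; the household identifiers, seed, and variable names are hypothetical and not taken from the study's actual randomization code.

```python
import random

random.seed(42)  # fixed seed so the assignment is reproducible

# hypothetical identifiers for the transfer-recipient households
households = [f"HH{i:04d}" for i in range(500)]

assignment = {}
for hh in households:
    arm = {
        # Arm 1: transfer to the woman vs. the man in the household
        "recipient": random.choice(["woman", "man"]),
        # Arm 2: lump sum vs. nine monthly installments
        "schedule": random.choice(["lump_sum", "monthly"]),
        # all households start as $300 recipients
        "transfer_usd": 300,
    }
    if arm["schedule"] == "lump_sum":
        # randomize the payout month so the expected present value is
        # similar across the lump-sum and monthly groups
        arm["lump_sum_month"] = random.randint(1, 9)
    assignment[hh] = arm

# Arm 3: a 28% subset of existing $300 recipients receives an $800 top-up,
# bringing their total transfer to $1,100
for hh in random.sample(households, k=int(0.28 * len(households))):
    assignment[hh]["transfer_usd"] = 1100
```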
Evaluation Design
Sampling and identification strategy
To establish a causal relationship between the program and changes in outcomes, this study uses a randomized controlled trial (RCT). We first identified Rarieda as an intervention area because it has (i) high poverty rates according to census data, and (ii) sufficient M-Pesa access to make transfers feasible. We then identified 100 villages based on the overall prevalence of eligible households in the village. In these villages, we identified 1,500 eligible households, with eligibility determined by residing in a home made of mud, grass, and other non-solid materials. These criteria are simple, objective, and transparent, maximizing accountability; they were not pre-announced to avoid “gaming” of the eligibility rules. We then randomized at two levels: across villages and within villages. Specifically, 50 villages were randomly assigned to be treatment villages, while the other 50 were pure control villages; in each of the latter, we surveyed 10 households that did not receive a cash transfer. Within treatment villages, we conducted a within-village randomization: 50% of eligible households were randomly assigned a cash transfer, while the other 50% received no transfer (GD will seek to make transfers to this group after the study). This strategy allows us to identify spillover effects (detailed in the pre-analysis plan). The sketch below illustrates the two-level randomization.
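A minimal sketch of this two-level randomization, assuming hypothetical village and household identifiers and a roster of 15 eligible households per village (1,500 total):

```python
import random

random.seed(7)  # fixed seed for reproducibility

villages = [f"V{i:03d}" for i in range(100)]
# level 1: 50 treatment villages, 50 pure control villages
treatment_villages = set(random.sample(villages, k=50))

# hypothetical roster of eligible households per village
eligible = {v: [f"{v}-HH{j:02d}" for j in range(15)] for v in villages}

status = {}
for v, hhs in eligible.items():
    if v in treatment_villages:
        # level 2: within treatment villages, 50% of eligible households
        # are randomly assigned a cash transfer
        treated = set(random.sample(hhs, k=len(hhs) // 2))
        for hh in hhs:
            status[hh] = "transfer" if hh in treated else "spillover_control"
    else:
        # pure control villages: households are surveyed but receive no transfer
        for hh in hhs:
            status[hh] = "pure_control"
```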
Spillover effects
We use three approaches to quantify spillover effects. First, we use the pure control villages to quantify within-village spillovers: comparing control households in treatment villages to households in pure control villages identifies the within-village spillover effect (a sketch of this comparison follows below). Second, we identify spillover effects across villages. Note that these effects could potentially be even more pronounced than within villages, if, for instance, entire villages are affected by weather shocks. Using GPS data on village location, we can identify cross-village spillovers, under the assumption that these spillovers are geographically correlated. Third, a separate village-level survey elicited general equilibrium effects of the intervention on the local economy; we surveyed residents of the village on prices, labor supply, wages, crime, investment, community relations (e.g., perceived fairness of the targeting criteria), and power dynamics.
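A minimal sketch of the within-village spillover comparison, using synthetic placeholder data; the column names (group, village, outcome) are hypothetical, and clustering standard errors at the village level is our assumption rather than the study's stated specification:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# synthetic placeholder data mirroring the design
rng = np.random.default_rng(0)
n = 1500
df = pd.DataFrame({
    "village": rng.integers(0, 100, size=n),
    "group": rng.choice(["transfer", "spillover_control", "pure_control"], size=n),
})
df["outcome"] = rng.normal(size=n)

# restrict to untreated households: controls in treatment villages vs.
# households in pure control villages
untreated = df[df["group"].isin(["spillover_control", "pure_control"])].copy()
untreated["in_treatment_village"] = (untreated["group"] == "spillover_control").astype(int)

# the coefficient on in_treatment_village estimates the within-village
# spillover effect; standard errors are clustered by village
model = smf.ols("outcome ~ in_treatment_village", data=untreated).fit(
    cov_type="cluster", cov_kwds={"groups": untreated["village"]}
)
print(model.params["in_treatment_village"])
```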
Data collection methods and instruments
Data were collected at baseline and one year after the intervention. A midline with a subset of questions was administered to a sample of respondents each month after the intervention. Trained interviewers visited the households; both the primary male and the primary female of the household were interviewed (separately). Surveys were administered on netbooks using the Blaise survey software. Following standard IPA procedure, we performed backchecks on 10% of all interviews, re-asking 10% of the survey questions with a focus on non-changing information; this procedure was known to field officers ex ante. Saliva samples were collected using the Salivette (Sarstedt, Germany), which has been used extensively in psychological and medical research, and more recently in randomized trials in developing countries similar to this one. It requires the respondent to chew on a sterile cellulose swab, which is then centrifuged and analyzed for salivary cortisol.
Power calculation
The sample size of 500 individuals in each of the treatment, control, and pure control conditions was chosen based on a power calculation: a comparison of 1,000 individuals (500 treatment vs. 500 pure control) is sufficient to detect effect sizes of 0.2 SD with 89% power. Different treatment arms within the treatment group (male vs. female recipient, lump-sum vs. monthly, large vs. small transfers) can be compared with 60% power. The sketch below reproduces these figures.
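A minimal sketch reproducing these figures with statsmodels, assuming a two-sided two-sample t-test at alpha = 0.05 (our assumption; the original calculation's exact specification is not stated):

```python
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()

# 500 treatment vs. 500 pure control, 0.2 SD effect -> ~0.89 power
print(power.power(effect_size=0.2, nobs1=500, alpha=0.05, ratio=1.0))

# comparisons between treatment arms (~250 vs. ~250) -> ~0.60 power
print(power.power(effect_size=0.2, nobs1=250, alpha=0.05, ratio=1.0))

# with 20% attrition (400 per arm), power remains ~0.80 (see attrition section)
print(power.power(effect_size=0.2, nobs1=400, alpha=0.05, ratio=1.0))
```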
Risk and treatment of attrition
Attrition was not a significant concern in this study because it became evident early in GD’s work in Kenya that respondents were highly interested in maintaining relations with GiveDirectly in the hope of receiving future transfers (although these are never promised). Nevertheless, we used four approaches to limit and account for attrition. First, the survey contained a detailed tracking module developed by Innovations for Poverty Action (IPA), the NGO implementing the fieldwork; IPA and GD collaborated closely throughout the study to facilitate tracking. Second, we incentivized survey completion through a small appreciation gift (a jar of cooking fat); in addition, respondents earned money from the economic games in the survey. Third, our power calculations show that even with an attrition rate of 20%, power for the main comparison remains at 80%; in fact, we observed only 3% attrition between the two visits of the baseline. Finally, we control for attrition econometrically in the analysis, as detailed in the pre-analysis plan.