Experimental Design Details
Research Questions
The study addresses three related questions. First, how accurately do smallholder farmers recall past rainfall, and which systematic biases (recency overweighting, extremity bias, and salience effects from memorable events) distort that recall? Second, does providing localized objective rainfall histories improve forecast accuracy and shift agricultural behavior toward better-adapted choices? Third, are belief-relative shocks, defined as deviations between subjective expectations and realized rainfall, better predictors of agricultural production losses than conventional weather anomalies measured against historical means?
Identification Strategy
The experiment uses three sources of variation for identification. The B vs. A contrast identifies the causal effect of information provision on belief updating and end-of-season agronomic outcomes, using village-level randomization with individual-level data. The within-person contrast between Round 1 and Round 2 beliefs in Group B identifies the magnitude and direction of belief updating in response to the specific information signal received, using Group A's Round 1 to Round 2 change as the counterfactual for time trends and regression to the mean. The C vs. A contrast identifies anchoring, that is, the effect of objective historical data on the starting point of belief formation when the card is shown before any expectations are elicited, using Group A's uninformed Round 1 beliefs as the baseline.
Survey Instrument
The survey is implemented via KoBoToolbox (CAPI) on tablets. It is organized into the following modules, with routing by treatment arm:
Memory module (all arms): for each of the five past agricultural seasons (2021–2025), respondents allocate 10 tokens across nine rainfall images to express their recalled probability distribution, report onset timing beliefs across ten calendar intervals, rate their confidence on a 1–5 scale, and answer binary questions about whether they remember extreme conditions in that season.
Pre-information expectations module (Arms A and B only): respondents complete the same token allocation tasks for the upcoming 2026 season, yielding Round 1 belief distributions over rainfall amount and onset timing, along with confidence ratings and extreme weather expectations.
Intermediate module (all arms): questions on extreme weather events experienced in past seasons, factors expected to affect rainfall this season, social networks and information sharing, current adaptation plans, types of adaptation, and estimated adaptation costs.
Information treatment (Arms B and C only): the enumerator presents the village-specific laminated information card. The respondent is given time to examine the card; the enumerator answers clarifying questions only and offers no interpretation or advice.
Post-information expectations module (Arms A and B): Round 2 token allocation over rainfall amount and onset timing, confidence rating, extreme weather expectations, updated adaptation plans and costs. For Group A this round serves as a test-retest reliability check; for Group B it is the post-treatment elicitation. Question labels are neutral and do not reference the information treatment.
First and only expectations module (Arm C): identical in content to the post-information module above, with neutral labels that do not indicate this is a first measurement, to avoid demand effects.
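As a rough illustration of how a token allocation from the modules above could be turned into the belief measures used later in the analysis, the sketch below converts a 10-token allocation over nine categories into probability shares, a cumulative distribution, and an entropy value. The function names and the example allocation are hypothetical, and the use of Shannon entropy (in nats) as the belief entropy measure is an assumption, since the text does not state the formula.

```python
import math

N_TOKENS = 10      # tokens per respondent, as in the survey
N_CATEGORIES = 9   # rainfall images / categories

def tokens_to_probs(tokens):
    """Convert a token allocation into probability shares."""
    if len(tokens) != N_CATEGORIES:
        raise ValueError("expected one token count per category")
    if any(t < 0 for t in tokens) or sum(tokens) != N_TOKENS:
        raise ValueError("tokens must be non-negative and sum to N_TOKENS")
    return [t / N_TOKENS for t in tokens]

def cumulative(probs):
    """Running sum of the shares: F(k) = P(category <= k)."""
    out, total = [], 0.0
    for p in probs:
        total += p
        out.append(total)
    return out

def shannon_entropy(probs):
    """Shannon entropy in nats; empty categories contribute zero."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical allocation, not survey data.
example_tokens = [0, 0, 1, 2, 4, 2, 1, 0, 0]
example_probs = tokens_to_probs(example_tokens)
example_cdf = cumulative(example_probs)
example_entropy = shannon_entropy(example_probs)
```

With ten tokens, shares move in steps of 0.1, so entropy takes only a limited set of values; a respondent who places all tokens on one image has entropy zero.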
Estimating Equations
The primary intention-to-treat specification is:
Y_iv = tau_Y * Info_v + X'_iv * gamma + lambda_s + u_iv
where Y_iv is the standardized end-of-season EVI2 at farm i in village v, tau_Y is the intention-to-treat effect of information provision, Info_v is a binary treatment indicator equal to 1 for Groups B and C, X_iv is a vector of pre-specified baseline covariates (farming experience, land area, education, information access, pre-information belief entropy, memory divergence), lambda_s are randomization stratum fixed effects, and u_iv is the error term, with standard errors clustered at the village level. Wild cluster bootstrap p-values with Rademacher weights are the primary inference statistic. Conley spatial HAC and randomization-inference p-values are reported as robustness checks.
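The following is a minimal sketch of how the intention-to-treat coefficient and a village-clustered wild bootstrap p-value with Rademacher weights might be computed. It runs on simulated data and omits the covariate vector for brevity. The function names, the CR0 cluster-robust variance, and the null-imposed (restricted residual) bootstrap variant are all assumptions; the text does not specify which variant will be used. The example uses 999 draws to keep it fast, whereas the plan calls for 9,999.

```python
import numpy as np

def cluster_t(y, X, clusters, col):
    """OLS coefficient and cluster-robust (CR0) t-statistic for column `col`."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        m = clusters == g
        score = X[m].T @ resid[m]          # cluster-level score vector
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return beta[col], beta[col] / np.sqrt(V[col, col])

def wild_cluster_bootstrap_p(y, X, clusters, col, n_draws=999, seed=0):
    """Two-sided p-value for H0: beta[col] = 0 with the null imposed."""
    rng = np.random.default_rng(seed)
    _, t_obs = cluster_t(y, X, clusters, col)
    X_r = np.delete(X, col, axis=1)        # restricted model drops the tested regressor
    beta_r = np.linalg.lstsq(X_r, y, rcond=None)[0]
    fitted_r = X_r @ beta_r
    resid_r = y - fitted_r
    ids, idx = np.unique(clusters, return_inverse=True)
    exceed = 0
    for _ in range(n_draws):
        w = rng.choice([-1.0, 1.0], size=len(ids))   # one Rademacher weight per village
        y_star = fitted_r + w[idx] * resid_r
        _, t_star = cluster_t(y_star, X, clusters, col)
        exceed += int(abs(t_star) >= abs(t_obs))
    return exceed / n_draws

# Simulated data (hypothetical): 40 villages in 4 strata, 10 farms per village.
rng = np.random.default_rng(42)
n_villages, farms_per_village, n_strata = 40, 10, 4
stratum_v = np.repeat(np.arange(n_strata), n_villages // n_strata)
info_v = np.concatenate([rng.permutation([0, 1] * 5) for _ in range(n_strata)])
village = np.repeat(np.arange(n_villages), farms_per_village)
village_shock = rng.normal(0, 0.3, n_villages)
y = 1.0 * info_v[village] + village_shock[village] + rng.normal(0, 1, village.size)

# Design matrix: treatment indicator plus a full set of stratum dummies (lambda_s).
strata_dummies = (stratum_v[village][:, None] == np.arange(n_strata)).astype(float)
X = np.column_stack([info_v[village].astype(float), strata_dummies])

tau_hat, t_stat = cluster_t(y, X, village, col=0)
p_boot = wild_cluster_bootstrap_p(y, X, village, col=0)
```

Drawing one weight per village rather than per farm preserves the within-village correlation of the residuals in each bootstrap sample, which is the reason for bootstrapping at the cluster level when treatment is assigned by village.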
Belief updating is estimated at the category level as:
Delta_p_{i,k} = phi_0 + phi_1 * (r_{v,k} - p^pre_{i,k}) + mu_i + eps_{i,k}
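where Delta_p_{i,k} is the change in respondent i's token share for category k between Round 1 and Round 2, r_{v,k} is the objective historical frequency for that category from the information card for village v, p^pre_{i,k} is the Round 1 belief, mu_i is a respondent-level effect, and eps_{i,k} is the error term. phi_1 measures how strongly beliefs respond to the information signal: the fraction of the gap between the card's frequency and the prior belief that is closed between rounds. A mixture model additionally estimates the weight placed on the external signal relative to the memory-based prior.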
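A small worked sketch of the category-level updating regression may help fix ideas. It builds Delta_p and the signal-prior gap from share arrays, removes respondent means to account for the mu_i term, and computes the pooled OLS slope. The function name and all numbers are hypothetical; the toy Round 2 beliefs are constructed to close exactly 40% of the gap and ignore the 10-token granularity of the real instrument.

```python
import numpy as np

def updating_slope(p_pre, p_post, r):
    """Within-respondent OLS slope of belief change on the signal-prior gap.

    p_pre, p_post: arrays of shape (n_respondents, n_categories) of token shares.
    r: historical frequencies from the card, shape (n_categories,) for a single
       village or (n_respondents, n_categories) if respondents differ by village.
    """
    p_pre = np.asarray(p_pre, dtype=float)
    p_post = np.asarray(p_post, dtype=float)
    gap = np.asarray(r, dtype=float) - p_pre     # r_{v,k} - p^pre_{i,k}
    delta = p_post - p_pre                       # Delta_p_{i,k}
    # Remove respondent means (the mu_i term) before pooling categories.
    gap_w = gap - gap.mean(axis=1, keepdims=True)
    delta_w = delta - delta.mean(axis=1, keepdims=True)
    return float((gap_w * delta_w).sum() / (gap_w ** 2).sum())

# Hypothetical card frequencies and Round 1 beliefs for two respondents.
card = np.array([0.0, 0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1, 0.0])
round1 = np.array([
    [0.0, 0.0, 0.1, 0.2, 0.4, 0.2, 0.1, 0.0, 0.0],
    [0.1, 0.2, 0.3, 0.2, 0.1, 0.1, 0.0, 0.0, 0.0],
])
# Toy Round 2 beliefs: each respondent closes 40% of the gap to the card.
round2 = round1 + 0.4 * (card - round1)

phi1_hat = updating_slope(round1, round2, card)
```

In this constructed example the estimated slope recovers the 0.4 used to generate Round 2. A slope of 0 would correspond to no updating and a slope of 1 to full adoption of the card's frequencies.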
The belief-relative shock horse race regresses agronomic outcomes jointly on the subjective surprise measure S_i = 1 - F_i(k_i^R), where F_i is respondent i's subjective cumulative distribution and k_i^R is the realized rainfall category, and on the conventional rainfall anomaly Z_i constructed on the same onset-aligned 9-category support. Explanatory power is compared via R-squared decomposition and out-of-sample prediction.
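To make the surprise measure concrete, the sketch below evaluates S = 1 - F(k_R) for a single respondent. It reads F as the cumulative belief distribution over the nine categories, with F(k) = P(category <= k), and assumes categories are ordered from least to most rainfall and indexed from 0; none of these conventions is stated in the text, and the function name and belief vector are hypothetical.

```python
def surprise(probs, realized_category):
    """S = 1 - F(k_R), where F is the cumulative belief distribution.

    probs: belief shares over ordered categories (assumed to sum to 1).
    realized_category: 0-based index of the realized category.
    """
    if not 0 <= realized_category < len(probs):
        raise ValueError("realized category out of range")
    cdf_at_k = sum(probs[: realized_category + 1])   # F(k_R) = P(category <= k_R)
    return 1.0 - cdf_at_k

# Hypothetical Round 1 beliefs for one respondent.
beliefs = [0.0, 0.0, 0.1, 0.2, 0.4, 0.2, 0.1, 0.0, 0.0]

s_low = surprise(beliefs, 2)    # realized category in the respondent's lower tail
s_mode = surprise(beliefs, 4)   # realized category at the respondent's mode
s_high = surprise(beliefs, 8)   # realized category at the top of the support
```

Under these conventions S is close to 1 when realized rainfall falls in a category the respondent considered unusually low, and close to 0 when it falls at or above the top of the range the respondent considered plausible.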
Inference
All primary specifications use: village-clustered wild bootstrap p-values (Rademacher weights, 9,999 draws) as the primary inference statistic; Romano-Wolf stepdown adjusted p-values within pre-declared outcome families; Conley spatial HAC standard errors using village centroids as a robustness check; and randomization-inference p-values for treatment indicator coefficients. Within each of the three pre-declared outcome families (agronomic performance, belief precision, belief-relative shock), hypotheses are tested jointly with the Romano-Wolf correction. All exploratory analyses are labeled as such and are not subject to multiple-testing correction.
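The randomization-inference check can be sketched as follows, under simplifying assumptions: outcomes are collapsed to village means, the test statistic is a plain difference in means rather than the regression coefficient from the primary specification, and treatment is re-drawn by shuffling the observed assignments within each stratum. The function name and the simulated data are hypothetical.

```python
import numpy as np

def ri_p_value(y_village, treat, strata, n_perm=2000, seed=0):
    """Randomization-inference p-value for a stratified village-level design."""
    y_village, treat, strata = map(np.asarray, (y_village, treat, strata))
    rng = np.random.default_rng(seed)

    def stat(t):
        # Difference in means between treated and control villages.
        return y_village[t == 1].mean() - y_village[t == 0].mean()

    observed = stat(treat)
    exceed = 0
    for _ in range(n_perm):
        fake = treat.copy()
        for s in np.unique(strata):
            m = strata == s
            fake[m] = rng.permutation(treat[m])   # reshuffle within stratum
        exceed += int(abs(stat(fake)) >= abs(observed))
    # Count the observed assignment as one of the draws.
    return (exceed + 1) / (n_perm + 1)

# Simulated village-level data (hypothetical): 4 strata of 6 villages, 3 treated each.
rng = np.random.default_rng(1)
strata = np.repeat(np.arange(4), 6)
treat = np.tile([1, 1, 1, 0, 0, 0], 4)
y_village = 2.0 * treat + rng.normal(0, 0.5, treat.size)

p_ri = ri_p_value(y_village, treat, strata)
```

Shuffling within strata mirrors a stratified assignment mechanism, so the reference distribution only includes assignments that the original randomization could have produced.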
Timeline
Pre-season survey and information treatment: March 2026. End-of-season satellite outcome measurement: approximately August–September 2026 following harvest. Analysis and paper draft: late 2026.