Primary Outcomes (end points)
Outcomes and Mechanisms
To assess the effectiveness of the pen-pal program, we collect survey data at baseline and endline. The effects of the intervention are measured immediately after the intervention and again at a 15-month follow-up. These survey-based outcome measures are complemented by experimental data collected at endline.
Survey Outcomes
Inter-group Attitudes and National Identity
We collect self-reported survey outcomes on key dimensions of inter-group relations: prejudicial attitudes, compassion, trust, and collaboration. Prejudicial attitudes are measured using a short version of the Generalized Group Attitude Scale (Maiti et al., 2022). Compassion is assessed using the Compassion Scale (Pommier, 2019). Trust is measured using the standard trust scale from the German Socio-Economic Panel (SOEP), and collaboration is captured through four statements emphasizing willingness to work with members of the other ethnic group. Responses are recorded on a ten-point scale (1 = completely disagree, 10 = completely agree).
Items are aggregated to form composite indices for each construct. Higher scores indicate more progressive attitudes, except for the prejudice scale, where higher values indicate less progressive attitudes.
We additionally measure national identity using a binary indicator similar to Bagues and Roth (2023). Respondents report whether they identify more strongly with their local region or with Sri Lanka. The indicator equals one if the respondent identifies more strongly with Sri Lanka.
All multi-item outcomes are standardized following Kling, Liebman, and Katz (2007). Indices are constructed as equally weighted averages of z-scores, where z-scores are calculated using the control group mean and standard deviation.
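The index construction described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the item names, the toy data, and the use of the sample standard deviation (n - 1) are assumptions.

```python
from statistics import mean, stdev

def zscores(values, is_control):
    """Standardize values using the control group's mean and standard deviation."""
    control = [v for v, c in zip(values, is_control) if c]
    mu, sd = mean(control), stdev(control)  # sample SD (n - 1); an assumption
    return [(v - mu) / sd for v in values]

def summary_index(items, is_control):
    """Equally weighted average of item-level z-scores, one value per respondent.

    `items` maps a (hypothetical) item name to a list of responses,
    with one response per respondent in the same order as `is_control`.
    """
    standardized = [zscores(vals, is_control) for vals in items.values()]
    return [mean(row) for row in zip(*standardized)]

# Hypothetical data: four respondents, the first two in the control group.
items = {
    "trust_1": [2, 4, 6, 8],
    "trust_2": [1, 3, 5, 9],
}
is_control = [True, True, False, False]
index = summary_index(items, is_control)
```

By construction, each item's z-scores average to zero in the control group, so the index also has a control-group mean of zero, and treatment effects are read in control-group standard deviation units of the underlying items.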
Generalized Trust, Altruism, and Parochialism
To assess whether the intervention affects broader pro-social dispositions, we draw on items from the Global Preference Survey (Falk et al., 2018) to measure self-reported altruism and generalized trust.
Parochial versus universal altruism is measured through an allocation task in which respondents divide a fixed endowment between a randomly selected person from their local region and a randomly selected person from Sri Lanka. The parochialism measure is defined as the allocation to the regional recipient minus the allocation to the national recipient. Higher values indicate greater parochialism.
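As a worked example of the parochialism measure, the sketch below computes the regional-minus-national difference for one respondent. The endowment of 10 units and the requirement that it be split fully between the two recipients are assumptions made only for illustration; the text does not state the endowment size.

```python
ENDOWMENT = 10  # hypothetical endowment size; not specified in the text

def parochialism(regional_allocation, national_allocation):
    """Regional minus national allocation; higher values mean greater parochialism."""
    return regional_allocation - national_allocation

# A respondent who gives 7 units to the regional recipient and the rest nationally.
regional = 7
national = ENDOWMENT - regional
score = parochialism(regional, national)  # 7 - 3 = 4
```

Under these assumptions the raw measure ranges from minus the endowment (everything to the national recipient) to plus the endowment (everything to the regional recipient), with zero indicating an equal split.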
These measures are standardized using control group means and standard deviations. We treat them as secondary outcomes.
Potential Mechanisms: Information, Emotions, and Empathy
We explore informational and emotional channels of the intervention. Information is measured using an augmented version of the Intercultural Competence Index (Siddique et al., 2020). Empathy and emotional responses are measured using the Interpersonal Reactivity Index (Davis, 1980).
Experimental Outcomes
At endline, we complement survey measures with three incentivized experimental tasks: a Prisoner’s Dilemma, a Dictator Game (including social norm elicitation), and a donation decision. Except in the donation task, participants make each decision with both an in-group and an out-group partner (within-subject design; order randomized). One decision is randomly selected for payment. Payoffs are made in tokens redeemable for school supplies.
Prisoner’s Dilemma
Participants are endowed with four tokens and choose whether to keep them (“defect”) or transfer them to their partner (“cooperate”). Transferred tokens are doubled for the partner. Cooperation is measured as a binary indicator equal to one if the participant transfers their endowment. The game is played once with an in-group partner and once with an out-group partner.
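The payoff structure implied by this description can be sketched as follows. The sketch assumes that the partner faces the same endowment and the same choice; the function and variable names are illustrative.

```python
ENDOWMENT = 4   # tokens per participant, as in the task description
MULTIPLIER = 2  # transferred tokens are doubled for the partner

def payoff(own_cooperates, partner_cooperates):
    """Tokens a participant ends up with, given both players' choices."""
    kept = 0 if own_cooperates else ENDOWMENT
    received = MULTIPLIER * ENDOWMENT if partner_cooperates else 0
    return kept + received

# Payoff to the row player for each (own choice, partner choice) pair.
matrix = {(own, partner): payoff(own, partner)
          for own in (True, False)
          for partner in (True, False)}
```

Under these assumptions, defecting always yields four more tokens than cooperating regardless of the partner's choice, while mutual cooperation (8 tokens each) beats mutual defection (4 tokens each), which is the defining structure of a Prisoner's Dilemma.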
Dictator Game and Social Norms
Participants receive four tokens and decide how many tokens (0–4) to allocate to an anonymous recipient. The number of tokens transferred measures altruistic behavior and is elicited separately for in-group and out-group recipients.
To measure social norms (following Krupka and Weber, 2013), participants rate the social appropriateness of allocations of 0, 2, and 4 tokens on a four-point scale (very socially inappropriate to very socially appropriate). They also provide incentivized guesses of the modal appropriateness rating among classmates for each allocation. This allows us to distinguish personal normative beliefs from perceived peer norms.
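The scoring of an incentivized guess can be sketched as a check against the modal rating among classmates. The numeric coding of the four-point scale, the toy ratings, and the rule that a guess matching any tied mode counts as correct are assumptions; the text does not specify them.

```python
from statistics import multimode

# Assumed coding: 1 = very socially inappropriate ... 4 = very socially appropriate.
RATINGS = (1, 2, 3, 4)

def guess_is_correct(guess, classmates_ratings):
    """True if the guess matches a modal rating (ties count as correct; an assumption)."""
    return guess in multimode(classmates_ratings)

# Hypothetical classmates' ratings of allocating 0 tokens to the recipient.
ratings_for_zero_tokens = [1, 1, 2, 1, 3]
correct = guess_is_correct(1, ratings_for_zero_tokens)
```

Comparing a participant's own rating with their guess of the modal rating is what separates the personal normative belief from the perceived peer norm for each allocation.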
Donation Game
Participants decide how many of their four tokens to donate to a national environmental NGO rather than keep for themselves. The number of tokens donated measures willingness to contribute to a public good.
Sentiments of Written Letters
To examine mechanisms, we analyze the content of exchanged letters using natural language processing techniques. We apply the VADER sentiment model (Hutto and Gilbert, 2014) to compute positive, negative, neutral, and compound sentiment scores. The compound score (ranging from −1 to +1) serves as our main sentiment measure. As a robustness check, we use LIWC (Tausczik and Pennebaker, 2010) to measure emotional expression (positive emotion, negative emotion, anxiety, anger, and sadness) in the letters.
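To illustrate why the compound score is bounded between −1 and +1, the sketch below reproduces the normalization step that, to our understanding, the reference VADER implementation applies to the summed word-level valence of a text. The constant alpha = 15 is taken from that implementation as we recall it and should be treated as an assumption; the example valence sums are hypothetical, and the lexicon lookup and rule-based adjustments that precede this step are omitted.

```python
import math

def normalize(valence_sum, alpha=15):
    """Map an unbounded valence sum into [-1, 1] via x / sqrt(x^2 + alpha)."""
    score = valence_sum / math.sqrt(valence_sum * valence_sum + alpha)
    return max(-1.0, min(1.0, score))  # guard against floating-point overshoot

# Hypothetical valence sums for a neutral, a mildly positive, and a negative letter.
neutral = normalize(0.0)
positive = normalize(4.0)
negative = normalize(-4.0)
```

The mapping is monotone and symmetric around zero, so letters with more strongly positive wording receive higher compound scores, but with diminishing increments as the valence sum grows.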