Who Guards the Data? Enumerator Effort, Honesty, and Survey Quality?

Household surveys are central to empirical research, yet little is known about how professional survey enumerators, the frontline agents in data production, shape interview administration, data quality, and the content of survey responses. Using randomized assignment of enumerators to respondents, we test whether survey outcomes differ across respondents assigned to enumerators with different pre-fieldwork incentivized measures of effort, honesty, and views on politics and governance. Before fieldwork, we elicit enumerators’ effort and honesty using incentivized tasks: a real-effort task that measures diligence in a routine but demanding exercise, and a private-reporting task that measures willingness to misreport private information for monetary gain. We also elicit enumerators’ cognitive and socio-emotional skills, as well as their own answers to political and governance questions included in the respondent survey. We then link enumerators to the interviews they conduct in large surveys in Ethiopia and Kenya. The preregistered analysis tests whether enumerators’ effort, honesty, skills, and perceptions affect survey data and survey responses. By combining incentivized measures with randomized field assignment, the study provides evidence on how traits, skills, and views of enumerators shape survey data, with implications for recruitment, training, and supervision in large-scale field studies.

External Link(s)

Registration Citation

Citation

Abate, Gashaw T et al. 2026. "Who Guards the Data? Enumerator Effort, Honesty, and Survey Quality?." AEA RCT Registry. June 03. https://doi.org/10.1257/rct.18747-1.0

Sponsors & Partners

Experimental Details

Interventions

Intervention(s)

The study is being embedded in forthcoming large-scale household surveys in Ethiopia and Kenya. The intervention involves random assignment of enumerators to respondents as well as a lab-in-the-field experiment that entails eliciting honesty and effort levels. The scale and structure of the surveys, which allow random assignment of enumerators across respondents, provide a unique opportunity to measure the effect of enumerators’ important behavioral attributes (including honesty and effort) on survey data quality.

Intervention One: Random Assignment of Enumerators to respondents
The key intervention of this study is the random assignment of enumerators to respondents. This follows recent studies of enumerator effects (Di Maio and Fiala, 2020; Kerwin and Ordaz Reynoso, 2021; Rodriguez-Segura and Schueler, 2023; Kadam et al., 2025; West and Blom, 2017). We do this random assignment in two stages. First, we randomly assign enumerators to enumeration areas (villages). Second, within each enumeration area (village), we randomly assign respondents to the enumerators assigned to that enumeration area.
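
The two-stage assignment can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: the function name, the fixed number of enumerators per village, the round-robin balancing of workloads within a village, and the fixed seed are all assumptions made for the example.

```python
import random

def two_stage_assignment(enumerators, villages, per_village, seed=0):
    """Hypothetical sketch of a two-stage random assignment.

    enumerators: list of enumerator IDs
    villages: dict mapping village ID -> list of respondent IDs
    per_village: number of enumerators sent to each village (assumed)
    Returns a dict mapping respondent ID -> (village ID, enumerator ID).
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    assignment = {}
    for village in sorted(villages):
        # Stage 1: draw the enumerators who work in this village.
        team = rng.sample(enumerators, per_village)
        # Stage 2: shuffle respondents, then deal them out to the team
        # in turn so workloads within the village differ by at most one.
        respondents = list(villages[village])
        rng.shuffle(respondents)
        for i, respondent in enumerate(respondents):
            assignment[respondent] = (village, team[i % per_village])
    return assignment
```

Because enumerators are drawn independently for each village in this sketch, an enumerator may serve several villages; a design that caps the number of villages per enumerator would need an additional constraint.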

Intervention Two: Measuring survey enumerators’ honesty and effort levels as well as personality traits, cognitive skills, socio-emotional skills, and political opinions.

Beyond the random assignment of enumerators, we also aim to experimentally measure and elicit enumerators’ honesty and effort levels. These lab-in-the-field exercises will be administered to all enumerators and supervisors at the beginning of the surveys. All experimental sessions will be conducted in a manner that allows privacy and ensures adherence to the protocol and consistency across participants.

To measure enumerator honesty, we use a private coin-tossing task. The design follows the approach of Barfort et al. (2019) and Fischbacher and Föllmi-Heusi (2013), adapted to the survey context. For logistical simplicity, we implement a coin-tossing task instead of a dice-rolling task, as in related studies (e.g., Abeler et al., 2014; Cohn et al., 2014; Janezic and Gallego, 2020). In each round, participants privately toss a coin and report the realized outcome. Reporting heads yields a monetary payoff, while reporting tails yields zero. As the realized outcome is privately observed and payment depends only on the reported outcome, participants have a monetary incentive to misreport realized outcomes by overstating the number of heads.

We elicit effort using an incentivized zero-counting task administered before fieldwork. The task rewards careful attention and perseverance in a routine but demanding exercise (Abeler et al., 2011; Charness et al., 2018; Koch and Nafziger, 2020; Kaiser et al., 2024). The zero-counting exercise is a widely used and incentive-compatible measure of real effort (Charness et al., 2018). The task is repetitive and routine, resembling key features of survey data collection.

Finally, we examine whether enumerators’ personality traits (e.g., Big Five), cognitive skills (Raven’s test; Raven et al., 1998), socio-emotional skills, political opinions, and trust in local and national governments are associated with survey responses. To do so, we administer to enumerators selected modules and questions that are also included in the respondent survey. These include trust in various actors in communities and governments, as well as related political opinions.

Intervention Start Date

2026-05-26

Intervention End Date

2026-09-30

Primary Outcomes

Primary Outcomes (end points)

1. Enumerator’s real-world preferences and tendencies for unethical behavior and self-reported measures of effort
2. Result of comprehension test
3. Attrition or non-response rates
4. Interview duration or average number of interviews completed
5. Distance covered by enumerators within each village to capture enumerator’s effort.

Primary Outcomes (explanation)

1. Unethical preferences, decisions, attributes, and responses: we aim to elicit enumerators' preferences for monitoring and tendencies to pursue unethical decisions, including operating without a formal license, avoiding tax payments, and reported moral values associated with paying bribes. For this purpose, we ask enumerators about their formal license to operate as well as their tax payment history. Similarly, we collect their moral values associated with paying bribes and avoiding fares on public transport.

We also elicit enumerators' Big Five personality traits and related psychometric indicators, along with self-reported measures of effort, grit, and perseverance.

Finally, we administer small parts of the questionnaire meant to be addressed to respondents to elicit enumerators' responses to these questions. We focus on subjective questions and opinions related to political views and trust.

2. Result of comprehension test: For each survey, we aim to attract a larger number of enumerators than we actually use. This gives us the opportunity to identify the best enumerators out of the pool of trained enumerators. For this purpose, we will conduct a comprehension test at the end of the enumerators' training, immediately before deploying the enumerators to the field. We aim to use the comprehension test to select the most qualified enumerators. The test result will be a continuous outcome, which will also help us to identify those scoring the passing mark and those who do not. Thus, we aim to use both continuous and binary measures associated with the test result.

3. Interview duration - survey timestamps recorded for the whole duration of the survey, as well as at each module, will be used for measuring survey duration. Module-level duration is used to assess enumerator effort, fatigue, and potential data quality concerns, as unusually short or long completion times may indicate skipping, rushing, or respondent disengagement.

4. Distance covered by enumerators within each village - We aim to assemble the distance covered by each enumerator in each village using geo-reference of survey respondents' (household) residence. This helps us to capture the level of effort exerted by each enumerator. For this purpose, we aim to compute Euclidean distance between respondents covered by each enumerator and consider various forms of aggregation of these distances within each village.

5. Share of missing (non-consented) respondents: This outcome measures the share of households recorded as unavailable, absent, or not found during fieldwork. A relatively large share of missing respondents might reflect systematic avoidance behavior or lack of effort by the enumerator, for example when households are located far away.
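
The within-village distance measure described in item 4 can be sketched as follows. This is a minimal illustration under stated assumptions: household locations are given as (latitude, longitude) pairs in decimal degrees, they are projected to planar meters with an equirectangular approximation so that Euclidean distance is meaningful, and the two aggregates shown (path length in interview order and mean pairwise distance) are hypothetical choices, not necessarily those the study will use.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def to_local_xy(points):
    """Project (lat, lon) in degrees to planar meters around the centroid
    (equirectangular approximation, reasonable within a single village)."""
    lat0 = sum(lat for lat, _ in points) / len(points)
    lon0 = sum(lon for _, lon in points) / len(points)
    k = math.cos(math.radians(lat0))
    return [
        (EARTH_RADIUS_M * math.radians(lon - lon0) * k,
         EARTH_RADIUS_M * math.radians(lat - lat0))
        for lat, lon in points
    ]

def distance_summaries(points):
    """Return two hypothetical aggregates for one enumerator in one village:
    path length in the given (interview) order and mean pairwise distance.
    Expects at least one point."""
    xy = to_local_xy(points)
    path = sum(math.dist(a, b) for a, b in zip(xy, xy[1:]))
    pairs = [math.dist(a, b) for a, b in combinations(xy, 2)]
    mean_pairwise = sum(pairs) / len(pairs) if pairs else 0.0
    return {"path_m": path, "mean_pairwise_m": mean_pairwise}
```

The path-length aggregate depends on the order in which interviews were conducted, so it would need interview timestamps; the mean pairwise distance does not.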

Secondary Outcomes

Secondary Outcomes (end points)

1. Number of affirmative responses in roster items or ranking tasks.
2. Objectively measured performance reports by auditors and supervisors.

Secondary Outcomes (explanation)

Experimental Design

Measuring enumerator honesty
We will conduct a lab-in-the-field experiment with enumerators involved in survey data collection in Ethiopia and Kenya. The design follows the approaches in the experimental literature on dishonesty (e.g., Barfort et al., 2019; Fischbacher and Föllmi-Heusi, 2013) to measure enumerators’ propensity for dishonesty, and is adapted to our context. For logistical simplicity, we will implement a coin-toss task instead of a dice-rolling task, consistent with prior studies (e.g., Abeler et al., 2014; Cohn et al., 2014; Janezic and Gallego, 2020).

Approximately 80–90 enumerators, including supervisors, will participate across the two countries. Participants will be informed that the task is designed to understand decision-making among survey enumerators engaged in data collection activities and to test how well they guess in situations characterized by randomness. They will also be informed that they earn money based on their reported outcomes. Participants will be explicitly informed that the research team cannot observe the outcome of any coin toss and will be instructed to toss the coin privately, without the presence of anyone else.

Participants will complete six practice rounds to familiarize themselves with the procedure before starting the actual incentive-compatible exercise. The main coin-tossing exercise consists of 30 independent rounds. In each round, participants privately toss a coin and then report their outcome.

The experiment uses local currency. To standardize the task across participants and minimize concerns about coin availability or biased flips from worn or damaged coins, all participants will receive a new coin of the corresponding denomination obtained from a bank. In Ethiopia, participants receive a one-Birr coin, with the lion side defined as heads and the weighing-scale side as tails. In Kenya, participants receive a 20-shilling coin, with the African bush elephant (Ndovu) side defined as heads and the Kenya Coat of Arms (National Crest) side as tails (see Fig. A.1).

Reporting heads pays 50 Birr per round in Ethiopia and 80 KES per round in Kenya, whereas reporting tails pays zero. The Kenya payment is designed to be broadly comparable to the Ethiopian payoff in purchasing-power-parity terms: 50 Birr ≈ 80 KES. Since payment depends only on the reported outcome, participants have a monetary incentive to misreport the realized outcomes by overstating the number of heads.

The true outcomes are not observed by the research team, so this task creates a clear incentive to misreport. Cumulative monetary payments increase with the number of reported heads. Prior studies show that (dis)honesty measured through this approach predicts real-world unethical behavior and corruption (Fischbacher and Föllmi-Heusi, 2013; Barfort et al., 2019; Olsen et al., 2019; Cohn and Maréchal, 2018).

Under truthful reporting, the probability of reporting heads in any given round is 50 percent. Enumerator honesty is measured by comparing the distribution of reported heads to the distribution implied by truthful reporting under random chance. Deviations from this benchmark distribution will be used to construct measures of enumerators’ propensity to misreport.
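
As an illustration of the benchmark, the number of heads over 30 truthful rounds follows a Binomial(30, 0.5) distribution. The sketch below computes two simple individual-level summaries against that benchmark: the excess of reported heads over the expected 15, and the one-sided probability of reporting at least that many heads by chance. These are assumed, illustrative summaries with hypothetical function names, not necessarily the measures the study will construct.

```python
from math import comb

N_ROUNDS = 30   # rounds in the main task (from the design)
P_HEADS = 0.5   # probability of heads under truthful reporting

def upper_tail_prob(reported_heads, n=N_ROUNDS, p=P_HEADS):
    """P(X >= reported_heads) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(reported_heads, n + 1)
    )

def excess_heads(reported_heads, n=N_ROUNDS, p=P_HEADS):
    """Reported heads minus the number expected under truthful reporting."""
    return reported_heads - n * p
```

A high reported count is not proof that a given enumerator misreported, since chance alone can produce it; such summaries are most informative when compared across enumerators or aggregated over the sample.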

Measuring enumerator effort
We measure enumerator effort using an incentivized real-effort task, following Abeler et al. (2011) and Koch and Nafziger (2020). Enumerators count the number of zeros in a sequence of tables consisting of randomly ordered zeros and ones. The sequence of tables is randomly generated for each enumerator, and each table appears one at a time in the CAPI instrument. For each table, enumerators enter the number of zeros and can proceed to the next table only after submitting the correct answer. Incorrect submissions are not counted as completed tables; instead, the enumerator remains on the same table and must try again.

The task lasts for a maximum of 90 minutes, the minimum time needed to complete the field surveys. A countdown timer displayed in the CAPI instrument shows the time remaining. After each correctly completed table, enumerators choose whether to continue counting zeros in the tables or stop. Enumerators earn 15 Birr for each correctly completed table. To reduce potential confounding, including peer effects, enumerators were randomly assigned to rooms, seated apart from one another, and given staggered start times. These procedures limited the scope for enumerators to observe others’ progress or stopping decisions. We record the session, room, date, start time, and number of enumerators present, and account for session-level variation in the empirical analysis.

Our primary measure of effort is the number of correctly completed tables before the enumerator stops or reaches the time limit. This measure captures correct work rather than attempted work. The relatively long duration of the task also allows us to examine the pattern of effort over time. We construct additional measures of effort and performance using the task data. As alternative measures of effort provision, we use the total time spent working and accumulated earnings up to the point at which the enumerator stops. As a measure of productivity, we use the average time per correctly completed table.
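
The measures above can be sketched from a task log as follows. This is a minimal illustration assuming a hypothetical log format: a list of times (in seconds from task start) at which each table was correctly completed, plus the time at which the enumerator stopped. The 15 Birr piece rate and 90-minute limit come from the design; the function and field names are assumptions.

```python
PIECE_RATE_BIRR = 15        # payment per correctly completed table (from the design)
TIME_LIMIT_SEC = 90 * 60    # maximum task duration (from the design)

def effort_measures(completion_times_sec, stop_time_sec):
    """Summarize one enumerator's task log (hypothetical format).

    completion_times_sec: seconds from task start at which each table
        was correctly completed
    stop_time_sec: seconds from task start at which the enumerator stopped
    """
    time_worked = min(stop_time_sec, TIME_LIMIT_SEC)
    # Count only tables completed within the time actually worked.
    tables = sum(1 for t in completion_times_sec if t <= time_worked)
    return {
        "tables_completed": tables,                   # primary effort measure
        "time_worked_sec": time_worked,               # alternative measure
        "earnings_birr": tables * PIECE_RATE_BIRR,    # alternative measure
        # Productivity: average time per correct table (None if no table).
        "sec_per_table": time_worked / tables if tables else None,
    }
```

Keeping the individual completion times, rather than only the totals, also makes it possible to examine how the pace of work changes over the course of the task.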

The task is well-suited for measuring effort in this setting. It requires no prior knowledge, measures performance objectively, and has a limited scope for learning. At the same time, the task is tedious and repetitive, and completing additional tables involves costly effort for enumerators, much like administering conventional surveys.

Experimental Design Details

Not available

Randomization Method

Randomization is first done at the enumeration-area (village) level, assigning enumerators to villages or enumeration areas. The second stage randomly assigns respondents, or groups of respondents, to the enumerators assigned to each enumeration area. The final unit of randomization varies across surveys.

Randomization Unit

Youth groups in Ethiopia and farmers in Kenya

Was the treatment clustered?

Yes

Experiment Characteristics

Sample size: planned number of clusters

Sample size: planned number of observations

In Ethiopia, 42 enumerators will administer the survey across 183 youth groups comprising about 2500 youth. Similarly, the Kenyan sample includes 40-45 enumerators surveying about 2000 households and respondents.

Sample size (or number of clusters) by treatment arms

In each survey, 40-45 enumerators are randomly assigned to about 2000-2500 respondents.

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

To detect systematic change in (dis)honesty rates in response to the monetary incentive associated with reporting the outcome with the highest reward, we perform a simple power calculation to determine the number of tosses needed. For this purpose, we compiled effect sizes from existing studies (Fischbacher and Föllmi-Heusi, 2013; Hanna and Wang, 2017b; Barfort et al., 2019) and adopted a conservative estimate of 0.35. With a power level of 0.8 and an alpha equal to 0.05, the required number of rounds for a one-sided, one-sample test is 27, which we rounded up to 30 for convenience.

Supporting Documents and Materials

IRB

Institutional Review Boards (IRBs)

IRB Name

IFPRI IRB

IRB Approval Date

2025-12-29

IRB Approval Number

IRB #00007490

Analysis Plan

There is information in this trial unavailable to the public.