Agentic Feedback and Quality of Field Survey Data

Last registered on November 13, 2023

Pre-Trial

Trial Information

General Information

Title
Agentic Feedback and Quality of Field Survey Data
RCT ID
AEARCTR-0011962
Initial registration date
October 31, 2023

First published
November 08, 2023, 11:27 AM EST

Last updated
November 13, 2023, 1:16 AM EST

Locations

Region

Primary Investigator

Affiliation
Institute of Rural Management Anand

Other Primary Investigator(s)

PI Affiliation
Verghese Kurien Policy Lab (IRMA)

Additional Trial Information

Status
Completed
Start date
2023-08-05
End date
2023-11-10
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
In applied research, ensuring data quality has emerged as a significant concern, as evidenced by a substantial body of literature. Over the past two decades, data collection procedures have advanced notably, including a transition from paper-based data collection to tablet-based computer-assisted personal interviewing (CAPI) and improvements in telephone data collection during the COVID-19 pandemic. Despite these innovations on the technology side of the data collection process, initiatives at the level of the field enumerator remain few. Guidelines from most think tanks and research labs focus on survey training and survey manuals. Some researchers recommend conducting tests at the end of survey training to improve enumerators' awareness.

The field experiment is embedded in a household survey collected as part of the baseline survey for a school milk program in Gujarat, India. The state has historically experienced a higher incidence of stunting and malnutrition among children, and continues to do so. The program therefore seeks to provide liquid milk in government schools in order to improve children's nutritional status. The experiment is an attempt to generate causal evidence on how to improve the quality of anthropometric and other sensitive data at the individual and household level. In addition to standard household modules (including roster, assets, income, etc.), the survey captures information on child anthropometry, child consumption, and the mother's well-being and activity status.

The experiment consists of three experimental arms, to which 60 field enumerators are randomly allocated using a block design across 120 villages: (i) enumerators in the control group villages receive standard survey training with no tests (C); (ii) enumerators in the first treatment group take three tests at the end of each day of training and receive non-agentic feedback (T1); and (iii) enumerators in the second treatment group take the same tests as T1 but, in addition, receive agentic feedback with an option to resubmit based on that feedback (T2). The field experiment can thus examine the effect of frequent testing during survey training (T1 versus C) and of variation in feedback on the quality of survey data (T1 versus T2 and T2 versus C).

The experimental design facilitates estimation of the effects of testing and of agentic feedback on the quality of survey data. The quality of survey data is assessed from high-frequency checks completed on each day of field work. The potential channel for generic testing includes awareness, while agentic feedback may trigger mechanisms such as motivation, self-confidence, and self-responsibility.



External Link(s)

Registration Citation

Citation
Elangbam, Kyamba Khanganba and Vivek Pandey. 2023. "Agentic Feedback and Quality of Field Survey Data." AEA RCT Registry. November 13. https://doi.org/10.1257/rct.11962-1.2
Sponsors & Partners

Experimental Details

Interventions

Intervention(s)
The experiment consists of three experimental arms, to which 60 field enumerators are randomly allocated using a block design across 120 villages: (i) enumerators in the control group villages receive standard survey training with no tests (C); (ii) enumerators in the first treatment group take three tests at the end of each day of training and receive non-agentic feedback (T1); and (iii) enumerators in the second treatment group take the same tests as T1 but, in addition, receive agentic feedback with an option to resubmit based on that feedback (T2). The field experiment can thus examine the effect of frequent testing during survey training (T1 versus C) and of variation in feedback on the quality of survey data (T1 versus T2 and T2 versus C).
Intervention Start Date
2023-09-02
Intervention End Date
2023-10-31

Primary Outcomes

Primary Outcomes (end points)
The primary outcomes include: (i) number of valid skips, (ii) missing values, (iii) number of flags, and (iv) distribution of key survey variables such as weight, height, and psychological well-being
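The power-calculation field of this registration describes a flag rule for pilot data: a value is flagged if it lies more than 2 standard deviations from the mean. The following is a minimal sketch of counting flags per survey under the assumption that a rule of that kind is used. The function name, the field names `salary_income` and `mgnrega_days`, the toy records, and the use of the population standard deviation are all hypothetical; the study's actual high-frequency checks may differ.

```python
from statistics import mean, pstdev


def flag_counts(records, variables, threshold_sd=2.0):
    """Count, per record, the variables lying more than threshold_sd SDs from the mean."""
    bounds = {}
    for var in variables:
        values = [record[var] for record in records]
        centre = mean(values)
        spread = pstdev(values)  # population SD: an assumption, not stated in the source
        bounds[var] = (centre - threshold_sd * spread, centre + threshold_sd * spread)
    return [
        sum(1 for var in variables
            if not bounds[var][0] <= record[var] <= bounds[var][1])
        for record in records
    ]


# Toy data (hypothetical): nine typical surveys and one with an extreme income value.
records = [{"salary_income": 10, "mgnrega_days": 5} for _ in range(9)]
records.append({"salary_income": 100, "mgnrega_days": 5})

counts = flag_counts(records, ["salary_income", "mgnrega_days"])
print(counts)
```

Summing such counts by enumerator or by arm would give one possible version of the "number of flags" outcome.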
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The experimental design combines cluster and block designs, i.e., we block on clusters. The 60 enumerators serve as the clusters in the experiment. There are four blocks, formed by crossing high (H) and low (L) cognitive scores with high (H) and low (L) field-work experience. Blocking places 15 enumerators in each of the four blocks (i.e., HH, HL, LH, and LL). Within each block, enumerators are randomly allocated to the three experimental arms (i.e., C, T1, T2), 5 per arm, so 20 enumerators are allocated to each experimental arm across the four blocks. The enumerators are assigned to 120 villages for the field surveys in such a manner that each village is allocated to only one of the three experimental arms (C, T1, or T2).
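The blocked allocation described above can be sketched as follows. This is an illustration under stated assumptions, not the registered randomization code: the enumerator IDs, the seed, and the function name `assign_arms` are hypothetical, and the sketch covers only the allocation of enumerators to arms, not the subsequent assignment of villages.

```python
import random
from collections import Counter

ARMS = ("C", "T1", "T2")
BLOCKS = ("HH", "HL", "LH", "LL")  # cognitive score x field-work experience


def assign_arms(enumerators_by_block, seed=2023):
    """Shuffle enumerators within each block, then deal them evenly into the arms."""
    rng = random.Random(seed)  # fixed seed (hypothetical) so the draw is reproducible
    assignment = {}
    for block, enumerators in enumerators_by_block.items():
        if len(enumerators) % len(ARMS) != 0:
            raise ValueError(f"block {block} cannot be split evenly across arms")
        shuffled = list(enumerators)
        rng.shuffle(shuffled)
        for position, enumerator in enumerate(shuffled):
            assignment[enumerator] = ARMS[position % len(ARMS)]
    return assignment


# Hypothetical roster: 15 placeholder enumerator IDs in each of the four blocks.
roster = {block: [f"{block}-{i:02d}" for i in range(1, 16)] for block in BLOCKS}
assignment = assign_arms(roster)
arm_counts = Counter(assignment.values())
print(arm_counts)
```

Shuffling within blocks before dealing guarantees 5 enumerators per arm in every block, and hence 20 per arm overall, whichever seed is used.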
Experimental Design Details
Randomization Method
Randomization was done in the office by computer after the recruitment of field enumerators was completed.
Randomization Unit
Blocking on clusters, where the strata are based on enumerators' experience and cognitive scores. Cognitive scores were measured during the enumerator recruitment process.
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
120 villages and 60 enumerators
Sample size: planned number of observations
The unit of data collection is the household, i.e., 13 household surveys per village * 40 villages = 520 households per experimental arm, for a total of 1560 households (HHs) across the three experimental arms. The number of observations is therefore 1560 households. The anthropometric schedule will be administered to children in these households. Given that a number of outcomes will be reported at the child level, we expect to generate more than 3500 observations at the child level (we assume 2.3 children per household).
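The sample-size arithmetic above can be checked with a few lines of Python. All inputs are taken from this registration; only the constant names are introduced for illustration.

```python
SURVEYS_PER_VILLAGE = 13
VILLAGES_PER_ARM = 40
NUMBER_OF_ARMS = 3
NUMBER_OF_ENUMERATORS = 60
CHILDREN_PER_HOUSEHOLD = 2.3  # assumption stated in the registration

households_per_arm = SURVEYS_PER_VILLAGE * VILLAGES_PER_ARM            # 520
total_households = households_per_arm * NUMBER_OF_ARMS                 # 1560
households_per_enumerator = total_households // NUMBER_OF_ENUMERATORS  # 26
expected_children = round(total_households * CHILDREN_PER_HOUSEHOLD)   # 3588

print(households_per_arm, total_households,
      households_per_enumerator, expected_children)
```

The expected child count of 3588 is consistent with the stated figure of more than 3500 child-level observations, and 26 households per enumerator matches the cluster size used in the power calculation.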
Sample size (or number of clusters) by treatment arms
40 villages and 20 enumerators (control-C), 40 villages and 20 enumerators (non-agentic feedback-T1), and 40 villages and 20 enumerators (agentic feedback-T2)
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
The cluster size per enumerator is 26 households (i.e., field surveys). The unit of data collection is the household (i.e., 13 HH surveys * 40 villages = 520 households per experimental arm, for a total of 1560 HHs across the three experimental arms). For the power calculation, we calculated the MDE for only two groups, the control group and one treatment group, because the MDE for the second treatment arm would be the same in an ex-ante power calculation. The MDE calculation also accounts for unequal cluster sizes. For this, the coefficient of variation is obtained from the 2019 DSY field survey data, which include unique enumerator IDs. The table below shows the MDE for varying ICC and for the cases of equal and unequal cluster sizes (coefficient of variation). We used NRLM data from the state of Maharashtra, which neighbors the state of Gujarat. Flags are identified from 7 variables based on their extreme values (a value is flagged if it is more than 2 standard deviations away from the mean). These variables comprise income from 6 sources (enterprise income, fishery income, livestock income, agriculture income, salary income, and income from other sources) and one variable that captures the total number of days worked under MGNREGA. The outcome variable is then the total number of flags generated per survey across these 7 variables. The MDE ranges from -0.11 to -0.13 standard deviations (the negative sign represents a reduction in survey errors).
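To illustrate how ICC and unequal cluster sizes enter an MDE calculation, here is a minimal sketch using a common textbook approximation: the design effect 1 + ((1 + CV^2) * m - 1) * ICC, combined with a two-arm, equal-allocation normal approximation. It is not the authors' calculation and does not attempt to reproduce the registered range, which depends on inputs not reported in this record (pilot ICC and CV, significance level, power, and any covariate adjustment). The function names, the 5% two-sided significance level, the 80% power, and the ICC and CV grids are all assumptions.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(mean_cluster_size, icc, cv=0.0):
    """Variance inflation from clustering; cv is the coefficient of variation of cluster sizes."""
    return 1.0 + ((1.0 + cv ** 2) * mean_cluster_size - 1.0) * icc


def mde_in_sd(n_per_arm, mean_cluster_size, icc, cv=0.0, alpha=0.05, power=0.80):
    """Two-arm, equal-allocation MDE in standard-deviation units (normal approximation)."""
    standard_normal = NormalDist()
    multiplier = standard_normal.inv_cdf(1.0 - alpha / 2.0) + standard_normal.inv_cdf(power)
    deff = design_effect(mean_cluster_size, icc, cv)
    return multiplier * sqrt(2.0 * deff / n_per_arm)


# 520 households per arm and 26 per enumerator come from the registration;
# the ICC and CV values below are hypothetical.
for icc in (0.01, 0.05):
    for cv in (0.0, 0.5):
        print(f"icc={icc} cv={cv} mde={mde_in_sd(520, 26, icc, cv):.3f}")
```

Setting `cv=0.0` gives the equal-cluster-size case; for a positive ICC, a positive CV inflates the design effect and therefore the MDE.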
IRB

Institutional Review Boards (IRBs)

IRB Name
Institute of Rural Management Anand
IRB Approval Date
2023-06-22
IRB Approval Number
IRMA/REC/006-22-06-2023

Post-Trial

Post Trial Information

Study Withdrawal

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials