Lying about Cheating

Last registered on August 14, 2015

Pre-Trial

Trial Information

General Information

Title
Lying about Cheating
RCT ID
AEARCTR-0000808
Initial registration date
August 11, 2015

Initial registration date is when the trial was registered. It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published
August 11, 2015, 3:51 PM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Last updated
August 14, 2015, 5:09 PM EDT

Last updated is the most recent time when changes to the trial's registration were published.

Locations

Region

Primary Investigator

Affiliation
ITAM

Other Primary Investigator(s)

Additional Trial Information

Status
In development
Start date
2015-08-11
End date
2016-06-30
Secondary IDs
Abstract
Corruption, fraud, clientelism, and other normatively unacceptable behaviors are of great practical and social-scientific interest, but they are difficult to study because they are difficult to measure. Survey respondents, in particular, are likely to underreport having partaken in, or witnessed, such behaviors. This study uses an experimental design to assess the biases of different methods of surveying people about their cheating behavior.
External Link(s)

Registration Citation

Citation
Simpser, Alberto. 2015. "Lying about Cheating." AEA RCT Registry. August 14. https://doi.org/10.1257/rct.808-2.0
Former Citation
Simpser, Alberto. 2015. "Lying about Cheating." AEA RCT Registry. August 14. https://www.socialscienceregistry.org/trials/808/history/4990
Sponsors & Partners

There is information in this trial that is unavailable to the public.
Experimental Details

Interventions

Intervention(s)
The interventions are described in the hidden part of this registration.
Intervention Start Date
2015-08-14
Intervention End Date
2015-08-28

Primary Outcomes

Primary Outcomes (end points)
The fraction of subjects who truthfully report having cheated is the main outcome variable.

Secondary outcome variables include responses to the questions about corruption, bribery, and expectations included at the end of the survey.
Primary Outcomes (explanation)
The average treatment effect of interest is the difference between the aggregate response bias in two experimental conditions.

The fraction of subjects who actually cheat in an experimental condition is estimated from the difference between the theoretical distribution of the outcome of the draw and the distribution of self-reported draws in that condition. (Randomization implies that the true cheating rate should be roughly equal across experimental conditions.)

The aggregate response bias in an experimental condition is the difference between the fraction of subjects who admit to having cheated and the fraction estimated to have actually cheated.

These variables are described more precisely in the attached PDF document.
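As a reading aid, one way to formalize these definitions is sketched below; the notation is mine and is not part of the registration.

```latex
% Illustrative notation (not from the registration); k indexes the experimental condition.
% a_k : fraction of subjects in condition k who admit to having cheated
% c_k : fraction in condition k estimated to have actually cheated, obtained by
%       comparing the theoretical distribution of the draw with the distribution
%       of self-reported draws in that condition
% b_k : aggregate response bias in condition k
\[
  b_k \;=\; a_k - c_k
\]
% Average treatment effect of survey technique k relative to technique l:
\[
  \tau_{k,l} \;=\; b_k - b_l
\]
```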

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The experimental design is described in the hidden part of the registration for this study.
Experimental Design Details
In this study I propose an experimental design to measure the causal effect of different survey question techniques on reporting bias for a sensitive question concerning cheating. This research design makes it possible to estimate the true aggregate level of cheating – the key quantity that has remained elusive in most previous studies. In the core part of the study, subjects are privately exposed to a draw from a known distribution and are remunerated in proportion to the value of the draw that they report having obtained (thereby providing an incentive to exaggerate). They are then asked whether they reported their draw accurately or exaggerated. In each experimental condition, subjects are asked this question using a different survey technique: direct questioning, indirect questioning, the randomized response technique (RRT), or the item count technique (ICT). This makes it possible to estimate the average causal effect of each survey technique on response bias by comparing self-reported cheating with directly estimated cheating in each experimental condition.

Laboratory experiments are often criticized on the grounds that experimenter demand effects (EDE) could distort subject behavior. Even when EDE distort behavior, this is cause for concern only when the distortion could in principle spuriously produce the key results; distortions orthogonal to the behavior of interest can safely be ignored (Zizzo 2010). There is good reason to expect that EDE are not a cause for concern in the present study. First, it is possible that EDE could influence incentives to cheat (i.e., to misreport one's draw in order to receive a higher payment). However, the rate of cheating is not the estimand of interest, and any such distortion would be identical in all the experimental conditions. Second, it is not easy to see how EDE might distort the estimand of interest, i.e., the difference between the proportion who report having cheated and the estimated rate of cheating in each experimental condition. One would have to argue, among other things, that different modes of asking (each associated with an experimental condition) are subject to different degrees of EDE. While I see no evident way in which this might be the case, I nevertheless take several measures to mitigate potential EDE.

First, non-deceptive obfuscation is implemented by adding a filler section. The filler section makes it very difficult for the subject to figure out exactly what the researcher is after, since both the filler and core sections are plausibly of interest in the context of the study. Second, non-deceptive obfuscation is utilized within the core section of the study, in order to further decrease the possibility that subjects could figure out what the experimenter is after (or even what the central topic of the study is). Third, incentivized questions are utilized in both the core and the filler sections of the study, so that differences in incentives are not a clear indicator of researcher interest. Finally, to avoid EDE stemming from the physical presence of a researcher in the room (e.g., via eye contact, or tone of voice), the study is conducted entirely online.

More concretely, the first section of the study will focus on the use of small change -- a topic of salient historical and contemporary relevance due to a traditional scarcity of small-denomination currency in Mexico. This section includes an incentivized question with the structure of a “beauty contest” (asking subjects which coin denomination they believe is most frequent among participants in the study). The second section will ask the subject to toss a coin several times and report the number of heads, providing incentives for subjects to exaggerate. The subject will then be asked whether she misreported. The manner of asking is the treatment variable, and it will vary across the experimental conditions. The four conditions are: direct questioning, indirect questioning, ICT, and RRT. This is the core section. The filler section provides a plausible alternative reason for asking subjects to have a coin at hand. The core section will additionally contain filler material emphasizing the topic of informal decision-making devices (such as the coin toss or rock-paper-scissors), to provide a plausible alternative reason for asking subjects to toss the coin.

Further details are provided in the attached document and links.
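As an illustration of how the directly estimated cheating rate and the aggregate response bias could be computed from the coin-toss reports, the following is a minimal sketch; the number of tosses, the excess-mass estimator, and all names are assumptions of mine rather than part of the registered design.

```python
# Minimal sketch, not the registered analysis plan. The number of tosses,
# the excess-mass estimator, and all names below are assumptions.
import numpy as np
from scipy.stats import binom

N_TOSSES = 10  # assumed number of coin tosses per subject

def estimated_cheat_fraction(reported_heads):
    """Estimate the share of subjects who inflated their report by comparing the
    empirical distribution of reported heads with the honest Binomial(n, 0.5)."""
    reported_heads = np.asarray(reported_heads)
    ks = np.arange(N_TOSSES + 1)
    theoretical = binom.pmf(ks, N_TOSSES, 0.5)
    observed = np.array([(reported_heads == k).mean() for k in ks])
    # Total probability mass reported in excess of the honest benchmark.
    return float(np.clip(observed - theoretical, 0, None).sum())

def response_bias(admitted_cheating, reported_heads):
    """Aggregate response bias in a condition: fraction admitting to cheating
    minus the directly estimated fraction who actually cheated."""
    return float(np.mean(admitted_cheating)) - estimated_cheat_fraction(reported_heads)

# The average treatment effect of a survey technique relative to direct questioning
# is the difference between response_bias(...) computed within each condition.
```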
Randomization Method
Randomization by computer.
Randomization Unit
Individuals.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
The design is not clustered; randomization is at the level of the individual.
Sample size: planned number of observations
I cannot know in advance what the response rate will be; I expect 200-300 subjects in total.
Sample size (or number of clusters) by treatment arms
Treatment conditions 1, 2, and 4 each have a 24% probability of being assigned; treatment condition 3 has a 28% probability.
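A minimal sketch of individual-level computer randomization with these assignment probabilities is given below; the mapping of condition numbers to technique labels follows the order listed in the design section and is an assumption of mine.

```python
# Illustrative sketch only; condition labels and their ordering are assumptions.
import numpy as np

CONDITIONS = ["direct", "indirect", "ICT", "RRT"]
PROBS = [0.24, 0.24, 0.28, 0.24]  # conditions 1, 2, and 4 at 24%; condition 3 at 28%

def assign_conditions(n_subjects, seed=12345):
    """Assign each individual independently to one of the four conditions."""
    rng = np.random.default_rng(seed)
    return rng.choice(CONDITIONS, size=n_subjects, p=PROBS)

# Example: assign_conditions(250) returns an array of 250 condition labels.
```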
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Since this pre-registration text precedes the running of the pilot and of the full study, I rely on the literature to obtain a rough sense of the adequate sample size. I focus mainly on treatment effects taking direct questioning (condition 1) as the control condition and each of the other three conditions as a separate treatment, and I draw on two studies in which the sensitive item concerns some form of cheating, as well as a general meta-analysis.

• Ocantos et al. (2012) study vote buying using the ICT. They find that only 2.4% (s.e. = 0.6%) of subjects reported receiving an individual gift in exchange for their vote during a 2008 election in Nicaragua when asked directly, but 24% (s.e. = 5.5%) reported having received such a gift when asked through the ICT. Assuming equal variances for the treatment and control groups, and supposing the variance is numerically equal to that estimated for the treatment group (a conservative choice in this case), power is close to 1.0 even for a sample size as small as 30, since the effect size is so large. The formula I am using is \beta = \Phi\!\left(\frac{|\mu_{t}-\mu_{c}|\sqrt{N}}{2\sigma} - \Phi^{-1}(1-\alpha/2)\right), where \beta denotes power, \Phi is the standard Normal cumulative distribution function, \Phi^{-1} is its inverse, \sigma is the standard deviation of the outcome in both the treatment and control groups, N is the sample size, and \alpha is the significance level (Gerber and Green 2012).

• Van der Heijden et al. (2000) compare RRT with direct questioning in both a face-to-face survey and a computer-assisted survey. The proportion of subjects known to have engaged in income fraud who admitted to it was 43% (s.e. = 6.8%), 25% (s.e. = 4.4%), and 19% (s.e. = 5.8%) under RRT, face-to-face direct questioning, and computer direct questioning, respectively. The effect size here is also so large that even 30 subjects suffice to attain power close to 1.0 (assuming, for example, that the standard error of the outcomes is 6.8% and taking 43% - 25% = 18% as the magnitude of the treatment effect).

• In a meta-analysis of RRT, Lensvelt-Mulders et al. (2005) find that the mean percent underestimation of a sensitive item is 38% (s.e. = .099) under RRT, 42% (s.e. = .099) in face-to-face interviews, 46% (s.e. = .138) in phone interviews, 47% (s.e. = .14) in self-administered questionnaires, and 62% (s.e. = .191) in computer-assisted surveys. Comparing the rate of reporting of the sensitive behavior under RRT with that under self-administered questionnaires, and taking the standard error to be 0.14, a sample size of 75 is necessary to attain power of 0.8.

In sum, while I face considerable uncertainty about effect sizes and variances before running the study, an N of 50 to 70 per treatment condition seems reasonable. It is not clear whether a small pilot study (with 15-30 respondents) would suffice to reduce this uncertainty in a meaningful way. To further improve power, improve covariate balance, and reduce variability, I will estimate treatment effects with pre-treatment covariates added as control variables, and (alternatively) I will implement sequential blocking on pre-treatment covariates (after collecting the data, but without using outcome data for blocking; see Moore and Moore 2011).
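For readers who want to reproduce the power figures above, here is a hedged Python sketch of the quoted formula; the function name and the use of SciPy are my choices and not part of the registration.

```python
# Sketch of the power formula quoted above (Gerber and Green 2012); illustrative only.
from math import sqrt
from scipy.stats import norm

def power(mu_t, mu_c, sigma, n, alpha=0.05):
    """beta = Phi(|mu_t - mu_c| * sqrt(N) / (2 * sigma) - Phi^{-1}(1 - alpha / 2))."""
    return norm.cdf(abs(mu_t - mu_c) * sqrt(n) / (2 * sigma) - norm.ppf(1 - alpha / 2))

# Ocantos et al. (2012) comparison: 2.4% vs. 24%, sigma taken as 0.055, N = 30.
print(power(0.024, 0.24, 0.055, 30))  # close to 1.0
# Lensvelt-Mulders et al. (2005) comparison: 38% vs. 47%, sigma = 0.14, N = 75.
print(power(0.38, 0.47, 0.14, 75))    # about 0.79, close to the 0.8 quoted above
```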
Supporting Documents and Materials

There is information in this trial that is unavailable to the public.
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial that is unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials