Field
Trial Title

Before
Overestimation of Self-Evaluation of One’s Performance

After
The effect of incentives in a puzzle-solving game

Field
Investigator

Before
Amy Kang

After
Katherine Silz-Carson

Field
Abstract

Before
Competition tied in with self-evaluation can trigger dishonesty in even the most honest person. In many professional settings, including in some parts of the United States military, subordinates are required to provide significant input about their work performance to their superiors as a part of the performance appraisal process. In some cases, subordinates are expected to draft their own performance appraisals. These appraisals may then be used to rank-order subordinates to determine who receives a limited number of rewards, such as a “Definitely Promote” (DP) rating of an officer in the United States military. The research question under investigation in this experiment is whether this form of competition for a limited number of rewards induces individuals to overstate the quality of their performance, something which could be considered a mild form of lying. We plan to conduct this research through a matrix-solving game administered on Amazon Mechanical Turk (MTurk). In the game, participants will be given 20 matrices of 12 numbers and will be asked to select the two numbers in each matrix that add up to 10. The participants will report the number of matrices that they believe they solved correctly. We hypothesize that under a competition-based compensation scheme, individuals are more likely to overstate the quality of their performance.

After
Competition tied in with self-evaluation can trigger dishonesty in even the most honest person. In many professional settings, including in some parts of the United States military, subordinates are required to provide significant input about their work performance to their superiors as a part of the performance appraisal process. In some cases, subordinates are expected to draft their own performance appraisals. These appraisals may then be used to rank-order subordinates to determine who receives a limited number of rewards, such as a “Definitely Promote” (DP) rating of an officer in the United States military. The objective of this study is to examine the effects of two alternative incentive systems, designed to simulate alternative performance-reward systems, on stated performance in a matrix-solving game.

Field
Trial Start Date

Before
May 09, 2022

After
August 08, 2022

Field
Trial End Date

Before
May 23, 2022

After
December 16, 2022

Field
JEL Code(s)

Before

After
C91

Field
Last Published

Before
April 28, 2022 06:07 PM

After
July 13, 2022 04:29 PM

Field
Intervention (Public)

Before
Through this experiment, we will be testing how many matrices a participant can solve correctly. Participants will earn more money if they score in a higher percentile.

After
The objective of this study is to examine the effects of two alternative incentive systems on stated performance in a matrix-solving game. In this game, participants are given a set of 20 matrices containing 12 numbers each. For each matrix, participants are asked to select the two numbers that sum to 10. After completing all 20 matrices, participants will report on a separate page the total number of matrices that they believe they solved correctly. Participants will be compensated based on their stated number of matrices solved, and will be randomly assigned to one of two possible incentive systems to determine their experimental compensation.

Field
Intervention Start Date

Before
May 09, 2022

After
August 15, 2022

Field
Intervention End Date

Before
May 16, 2022

After
September 30, 2022

Field
Primary Outcomes (End Points)

Before
We hypothesize that under a competition-based compensation scheme, individuals are more likely to overstate the quality of their performance. With these data we can construct a test statistic using an outcome variable, “Overstatement.”

After
Outcome 1: Stated performance in Incentive System #1
Outcome 2: Stated performance in Incentive System #2

Field
Primary Outcomes (Explanation)

Before
Our null hypothesis is that a competition-based environment makes no difference to the results. The alternative hypothesis is that in a competition-based environment, people are more likely to cheat. If the test statistic is statistically significant, we reject the null hypothesis of no difference.

After
Overstatement is the difference between the number of matrices that a participant claims to solve and the number of matrices that they actually solve (observable based on their responses to the 20 matrix problems): Overstatement = stated number solved - actual number solved.

Field
Experimental Design (Public)

Before
To investigate this question, we use the matrix-solving game utilized by Rigdon and D’Esterre (2015, “The effects of competition on the nature of cheating behavior,” Southern Economic Journal 81(4), 1012-1024). The game will be conducted through Amazon Mechanical Turk (MTurk). In the game, participants will be given 20 matrices of 12 numbers and will be asked to select the two numbers in each matrix that add up to 10. After completing all 20 matrices, the participants will report the number of matrices that they believe they solved correctly. In the no-competition treatment, participants’ payment amount will be determined by whether they report solving more than a threshold of 17 matrices correctly. Participants who exceed the threshold will receive a larger payment than those who do not. In the competition treatment, subjects must not only exceed the minimum threshold of 17 correct to earn the higher payment, but must also score in the top 10% of subjects.

After
The objective of this study is to examine the effects of two alternative incentive systems on stated performance in a matrix-solving game. In this game, participants are given a set of 20 matrices containing 12 numbers each. For each matrix, participants are asked to select the two numbers that sum to 10. After completing all 20 matrices, participants will report on a separate page the total number of matrices that they believe they solved correctly. Participants will be compensated based on their stated number of matrices solved, and will be randomly assigned to one of two possible incentive systems to determine their experimental compensation.
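To make the task mechanics concrete, the per-matrix check can be sketched in a few lines. This is an illustrative snippet only; `find_pair` and `actual_score` are hypothetical helper names, not part of the study's actual software.

```python
from itertools import combinations

def find_pair(matrix, target=10.0):
    """Return the first pair of entries in a matrix that sums to the target, or None."""
    for a, b in combinations(matrix, 2):
        if abs(a + b - target) < 1e-9:  # tolerance for floating-point entries
            return (a, b)
    return None

def actual_score(selected_pairs, target=10.0):
    """Count the matrices (out of 20) where the participant's selected pair
    really sums to the target -- the 'actual number solved'."""
    return sum(1 for pair in selected_pairs
               if pair is not None and abs(sum(pair) - target) < 1e-9)
```

Comparing `actual_score` against the participant's self-reported total is what makes overstatement observable in this design.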

Field
Randomization Method

Before
The game will be conducted through Amazon Mechanical Turk (MTurk). Randomization occurs within the population taking surveys on MTurk. This program allows individuals and companies to “harness the collective intelligence, skills, and insights from a global workforce to streamline business processes, augment data collection and analysis, and accelerate machine learning development” (MTurk Website, https://www.mturk.com/). It is also important to note that these participants must answer questions about whether they have received a high school diploma. This baseline must be implemented in the experiment because our data would be inaccurate if the participants had significantly different levels of education.

After
Participants will be randomly assigned to one of two incentive systems by a computer.

Field
Randomization Unit

Before
NA

After
Random assignment occurs at the individual participant level.

Field
Planned Number of Clusters

Before
2

After
N/A: observations in this experimental design are not clustered.

Field
Planned Number of Observations

Before
73

After
150 individual participants (75 per incentive system)

Field
Sample size (or number of clusters) by treatment arms

Before
Sample size = 73
Control group = 36
Experimental group = 36

After
75 individual participants in incentive system #1
75 individual participants in incentive system #2

Field
Power calculation: Minimum Detectable Effect Size for Main Outcomes

Before
power onemean 1 2, sd(3) power(0.8)

After
A difference-of-means test of the null hypothesis that the amount of overstatement is the same in Incentive Systems #1 and #2 yields a minimum detectable effect size of approximately 1.4 puzzles. The analysis assumes independent random samples and the following parameters, which were drawn from the literature on prior studies that have used the matrix puzzle test.
Average overstatement in Incentive System #1: 1 puzzle
Standard deviation of overstatement (both Incentive Systems): 3 puzzles
Level of significance: 0.05
Power: 0.8.
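These parameters can be checked with the standard normal-approximation formula for the minimum detectable effect of a two-sided, two-sample difference-of-means test. This is a sketch; `mde_two_sample` is an illustrative helper, not the registration's actual Stata code.

```python
from statistics import NormalDist

def mde_two_sample(sd, n_per_arm, alpha=0.05, power=0.8):
    """Minimum detectable difference in means for a two-sided, two-sample
    z-test with equal arm sizes and a common standard deviation sd."""
    z = NormalDist().inv_cdf
    # MDE = (z_{1-alpha/2} + z_{power}) * sd * sqrt(2/n)
    return (z(1 - alpha / 2) + z(power)) * sd * (2 / n_per_arm) ** 0.5

# Parameters stated in the registration: sd = 3 puzzles, 75 participants per arm.
mde = mde_two_sample(sd=3, n_per_arm=75)
```

With sd = 3 and 75 participants per arm, this formula gives a minimum detectable effect of roughly 1.4 puzzles.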

Field
Keyword(s)

Before
Behavior

After
Behavior, Labor

Field
Intervention (Hidden)

Before
This experiment not only tested how many matrices a participant could solve correctly, but also whether participants exaggerated their self-reports of how many they got right, depending on whether they were told they would receive more money. We told one group they would earn more money if they self-reported more matrices solved correctly, while the other group was simply told to solve the matrices without any extra incentive. Using the statistical program Stata, we can compare the number each participant reported with the number they actually solved. We plug these numbers into a regression equation whose output indicates whether overstatement is more prevalent in the group that was incentivized with more money.

After
The objective of this study is to examine the effects of two alternative incentive systems on stated performance in the matrix-solving game used by Rigdon and D’Esterre (2015). In this game, participants are given a set of 20 matrices containing 12 numbers each. For each matrix, participants are asked to select the two numbers that sum to 10. After completing all 20 matrices, participants will report on a separate page the total number of matrices that they believe they solved correctly. Participants will be compensated based on their stated number of matrices solved using one of two possible incentive systems:
Incentive System #1: Base rate + Bonus if solve at least 15 matrices
Incentive System #2: Base rate + Bonus if solve at least 15 matrices AND are in the top 10% of subjects assigned to this incentive system (including ties)
Participants will be randomly assigned to one of the two incentive systems. Incentive System #1 is designed to simulate most standard employee performance appraisal systems, in which employees become eligible for performance awards (e.g., time-off awards, cash bonuses, step increases in pay) only if they exceed some minimum standard of performance (e.g., in the Federal employee system, employees must rate at least a 3 on a 5-point scale to be eligible for these awards). Incentive System #2 also requires participants to meet this minimum standard, but adds competition in the form of a tournament. It is designed to simulate the competition that occurs in performance evaluation systems that only reward the top n% of employees, such as the P/DP promotion rating system that exists for members of the military. The null hypothesis to be tested is:
H0: The amount of overstatement of performance in Incentive System #1 and Incentive System #2 are the same.
We expect to be able to reject this hypothesis. Although our prior is that there will be more overstatement in Incentive System #2, some prior research (e.g., Cadsby, Song, and Tapon (2010)) has found more overclaims in target-based schemes than in tournament schemes. Thus, there is a possibility that there will be more overstatement in Incentive System #1 than in Incentive System #2. For this reason, we use a two-sided alternative hypothesis when specifying the parameters of the power analysis.
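The planned test of H0 can be sketched as a two-sample comparison of mean overstatement. The registration specifies Stata for the analysis; the snippet below is a Python illustration with made-up numbers, and `welch_t` is a hypothetical helper, not study code or study results.

```python
from statistics import mean, stdev

def overstatement(stated, actual):
    """Per-participant overstatement = stated number solved - actual number solved."""
    return [s - a for s, a in zip(stated, actual)]

def welch_t(x, y):
    """Welch two-sample t statistic for H0: equal mean overstatement."""
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se

# Hypothetical data for illustration only (not study results):
sys1 = overstatement(stated=[16, 15, 17, 15], actual=[15, 15, 16, 14])
sys2 = overstatement(stated=[18, 16, 19, 17], actual=[15, 14, 16, 13])
t = welch_t(sys2, sys1)  # large |t| would favor the two-sided alternative
```

In the actual analysis each arm would contain 75 observations, and the t statistic would be compared against a two-sided critical value at the 0.05 level.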

Field
Did you obtain IRB approval for this study?

Before
No

After
Yes

Field
Secondary Outcomes (End Points)

Before

After
None

Field
Pi as first author

Before
No

After
Yes

Field
Public locations

Before
Yes

After
No
