Experimental Design Details
We have developed a formal model in which a supervisor and an agent both receive signals about the agent's performance. The agent then reports a self-assessment to the supervisor, and the supervisor updates her beliefs about the agent's performance and issues the performance rating. The supervisor is paid based on the accuracy of her rating (as measured by the squared difference between true performance and the rating). Agents can receive a bonus that depends on the supervisor's rating. Agents thus trade off potential monetary gains against the disutility from dishonesty in self-assessments, and supervisors have incomplete information about the relative weight each agent assigns to the two objectives. We show in the model that (i) supervisors' ratings are increasing in agents' self-assessments, (ii) less honest agents inflate their self-assessments more strongly and thus receive higher bonuses, but (iii) the use of self-assessments still leads to higher overall expected accuracy of ratings.
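To fix ideas, a stylized version of the payoffs can be written as follows (the notation is ours and only illustrates the structure described above, not the model's exact formulation). A supervisor who gives rating r to an agent with true performance s earns

π_S = −(r − s)²,

while an agent who reports self-assessment m and has honesty weight θ earns

π_A = b(r(m)) − θ × c(m − s),

where b(·) is the bonus, increasing in the rating, and c(·) is a convex cost of misreporting. An agent with a lower θ bears a smaller lying cost and therefore inflates m more strongly, which is the mechanism behind predictions (i) and (ii).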
We test the predictions of the model by conducting an experiment on CloudResearch Connect. Subjects participate in the experiment either as agents (called workers in the experiment) or as supervisors (called raters in the experiment). In a first stage, subjects in the role of agents perform a real-effort entry task. The task consists of entering text contained in hard-to-read images (similar to so-called 'captchas'). Agents see 10 pages with 10 images on each page. Each page has one of five time limits: 17, 19, 21, 23, or 25 seconds. Each of these time limits occurs exactly twice, in randomized order; the order is the same for all agents. The time limit for the next page is announced during a 5-second countdown before the page starts. The number of correctly entered words out of the 100 words across all 10 pages constitutes an agent's true performance. After the entry task, agents review all their entries across the 10 pages and are then asked to submit a self-assessment of the number of correctly entered words (out of 100). Agents also see the distribution of performance of agents from another experiment with the same real-effort task (the same data that was shown to participants in Kusterer & Sliwka, forthcoming). They are informed whether this self-assessment will be revealed to the supervisor.
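As an illustration, the page schedule described above could be generated as follows (a minimal sketch; the variable names, the fixed seed, and the use of Python's random module are our assumptions, not the actual experimental software):

import random

TIME_LIMITS = [17, 19, 21, 23, 25]  # per-page time limits in seconds

# Each time limit occurs exactly twice; one random order is drawn
# once and then reused for every agent.
schedule = TIME_LIMITS * 2
random.seed(42)          # hypothetical seed so all agents share one order
random.shuffle(schedule)
assert len(schedule) == 10

# 10 pages x 10 images = 100 words; true performance is the number
# of correctly entered words across all 100 images.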
In a second stage, agents' performance is rated by subjects in the role of supervisors. For each rated agent, the supervisor sees the number of correctly entered words on one randomly selected screen. Each supervisor rates five agents, and one of these ratings is payoff-relevant for the supervisor and the respective agent. The supervisors also see the distribution of performance that was shown to the agents on their self-assessment page. Ratings are given on a scale from 0 to 100, and supervisors are told that they should provide a "rating for the number of correctly entered words out of the 100 displayed words across all 10 pages for this worker". In all treatments, a supervisor's payment depends on the accuracy of the rating, i.e., the squared deviation between the agent's performance on all 10 pages of the entry task and the supervisor's assessment of this performance.
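The information and incentive structure on the supervisor side can be sketched as follows (hypothetical function and variable names; a sketch under the stated design, not the actual implementation):

import random

def supervisor_session(agent_ids, rng=None):
    """For each of the 5 rated agents, one of the 10 screens is drawn
    at random and its score is shown to the supervisor; exactly one of
    the 5 ratings is drawn to be payoff-relevant for supervisor and agent."""
    rng = rng or random.Random()
    shown_screen = {a: rng.randrange(10) for a in agent_ids}  # screen index per agent
    payoff_relevant = rng.choice(agent_ids)                   # the one rating that counts
    return shown_screen, payoff_relevant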
We elicit agents' preferences for honesty with different survey measures (Grosch et al., 2020; Necker & Paetzel, 2023; Schudy et al., 2024).
We run a 2x2 experimental design, varying whether or not
(a) the agent's self-assessment is revealed to the supervisor after the agent has performed the data entry task, and
(b) the agent receives a bonus based on the rating provided by the supervisor.
Procedure of the experiment:
Participants are randomly assigned to a role (agent or supervisor) and to a treatment. The identity of participants is never revealed to other participants or to the experimenter. Payment is assigned based on the Connect ID, which the experimenter cannot link to the identity of participants. Before the experiment starts, participants provide informed consent. Specifically, participants are informed that they have the right to withdraw from the study at any time.
Participants are also informed that they have to complete the task within 60 minutes and to correctly answer a set of comprehension questions in order to be eligible for payment.
Supervisors receive a fixed wage of $0.80 plus a bonus of max{$2.50 − $0.001 × (Rating − Performance)², $0}. Agents receive a fixed wage of $0.80 + $1.70; in the treatments with a bonus, the bonus is $0.01 × Rating (i.e., up to $1 for a rating of 100).
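For concreteness, the two payment rules can be computed as below (a sketch of the stated formulas with hypothetical function names; e.g., with Performance = 60 and Rating = 70 the supervisor earns $0.80 + max{$2.50 − $0.001 × 100, $0} = $3.20, and in a bonus treatment the agent's bonus is $0.70):

def supervisor_pay(rating: int, performance: int) -> float:
    # Fixed wage plus accuracy bonus, floored at zero.
    return 0.80 + max(2.50 - 0.001 * (rating - performance) ** 2, 0.0)

def agent_pay(rating: int, bonus_treatment: bool) -> float:
    # Fixed wage of $0.80 + $1.70; $0.01 per rating point in bonus treatments.
    pay = 0.80 + 1.70
    if bonus_treatment:
        pay += 0.01 * rating
    return pay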
The treatments are summarized below. The number n in each cell reflects the number of agents. There is an equal number of supervisors in each cell.
1. No self-assessment, no bonus, n ≈ 1000
2. Self-assessment, no bonus, n ≈ 1000
3. No self-assessment, bonus, n ≈ 1000
4. Self-assessment, bonus, n ≈ 1000
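The four cells map directly onto two binary treatment flags (an illustrative encoding, not the actual assignment code):

TREATMENTS = [
    {"cell": 1, "self_assessment": False, "bonus": False},
    {"cell": 2, "self_assessment": True,  "bonus": False},
    {"cell": 3, "self_assessment": False, "bonus": True},
    {"cell": 4, "self_assessment": True,  "bonus": True},
]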
Exclusion criteria
On CloudResearch Connect, we exclude participants with an approval rate below 95%. Agents are excluded from payment (and from further participation in the study) if they do not enter the text from even a single image (i.e., make no entries at all), and they are informed about this exclusion. For the analysis, we also drop all observations from supervisors who spent less than 10 seconds in total on all 5 evaluation screens.
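The supervisor-side screening could be implemented as a simple filter (a sketch assuming a pandas DataFrame with hypothetical columns time_screen_1, ..., time_screen_5 holding seconds spent per evaluation screen):

import pandas as pd

def drop_fast_supervisors(df: pd.DataFrame) -> pd.DataFrame:
    # Keep supervisors who spent at least 10 seconds in total
    # across all 5 evaluation screens.
    screen_cols = [f"time_screen_{i}" for i in range(1, 6)]
    return df[df[screen_cols].sum(axis=1) >= 10]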
All participants must answer comprehension questions to make sure they understand the instructions. If they do not answer a question correctly after the second attempt, they are excluded from further participation.
References:
Grosch, K., Müller, S., Rau, H., & Zhurakhovska, L. (2020). Measuring (social) preferences with simple and short questionnaires. Mimeo.
Kusterer, D. J., & Sliwka, D. (forthcoming). Social Preferences and the Informativeness of Subjective Performance Evaluations. Management Science.
Necker, S., & Paetzel, F. (2023). The effect of losing and winning on cheating and effort in repeated competitions. Journal of Economic Psychology, 98, 102655.
Schudy, S., Grundmann, S., & Spantig, L. (2024). Individual Preferences for Truth-Telling. CESifo Working Paper No. 11521.