Reciprocity in Peer Assessments

Last registered on May 21, 2026

Pre-Trial

Trial Information

General Information

Title

Reciprocity in Peer Assessments

RCT ID

AEARCTR-0018376

Initial registration date

April 22, 2026

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published

April 29, 2026, 3:32 PM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Last updated

May 21, 2026, 6:51 AM EDT

Last updated is the most recent time when changes to the trial's registration were published.

Locations

Country

Cyprus

Region

Primary Investigator

Name

Lunzheng Li

Affiliation

Xiangtan University

Other Primary Investigator(s)

PI Name

Philippos Louis

PI Affiliation

University of Cyprus

PI Name

Zacharias Maniadis

PI Affiliation

University of Southampton

PI Name

Dimitrios Xefteris

PI Affiliation

University of Cyprus

Additional Trial Information

Status

In development

Start date

2026-04-23

End date

2026-12-31

Keywords

Behavior

Additional Keywords

Reciprocity, Peer assessment, Lying, Lab experiment

JEL code(s)

C7, C9, D9

Secondary IDs

Prior work

This trial does not extend or rely on any prior RCTs.

Abstract

Peer assessment is widely used in academic settings, workplace evaluations, and collaborative contexts as a scalable alternative to expert grading. A well-documented concern is that grades assigned by peers may be driven not only by the objective quality of the work being evaluated, but also by strategic and social considerations — most notably, reciprocity.
This study examines two related phenomena: (i) whether evaluators who expect their own grade to be influenced by the grade they assign (i.e., sequential first movers in a dyadic grading exchange) inflate their assessments in anticipation of reciprocal reward, and (ii) whether evaluators who have already received a grade adjust their own assessment in response to the surprise component of the grade they received.
We exploit a controlled laboratory design in which participants first complete an analytical task, then grade each other's responses. In the sequential condition, the second grader (the responder) observes the grade they received before assigning a grade in return, creating an identifiable window for reciprocal adjustment. In the simultaneous condition, both grades are submitted without knowledge of the other's assessment. Comparing grading behaviour across these two conditions allows us to isolate strategic anticipation and reciprocal adjustment from other determinants of peer grading.

External Link(s)

Registration Citation

Citation

Li, Lunzheng et al. 2026. "Reciprocity in Peer Assessments." AEA RCT Registry. May 21. https://doi.org/10.1257/rct.18376-1.1

Sponsors & Partners

Experimental Details

Interventions

Intervention(s)

Intervention Start Date

2026-04-23

Intervention End Date

2026-12-31

Primary Outcomes

Primary Outcomes (end points)

The grade assigned to one's peer for the response they submitted for the main task.

Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)

Secondary Outcomes (explanation)

Experimental Design

**Participants and Setting**
Participants will be recruited from a university subject pool and will participate in sessions of approximately 30 to 45 minutes. Participants are randomly assigned to either the simultaneous condition (control) or the sequential condition (treatment). The unit of randomisation is the individual; within each session, participants are randomly paired into dyads. Informed consent will be obtained from all participants prior to the experiment.
**CRT**
Participants are first asked to take a Cognitive Reflection Test (CRT) consisting of three multiple-choice questions. We use the three-item CRT MCQ-4 from Sirota & Juanchich (2018), keeping the order of items fixed across participants (bat & ball, widgets, lily pads) and randomizing the order of answer options in each item for each participant. After responding to the CRT, participants move on to the main task.
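The presentation rule above (fixed item order, per-participant shuffling of answer options) can be sketched as follows. This is a minimal illustration, not the experiment software: the item keys, the option labels, and the seeding scheme are all assumptions, and the actual CRT MCQ-4 wording is not reproduced.

```python
import random

# Item order is fixed across participants, as stated in the registration.
# The keys are shorthand labels chosen for this sketch.
CRT_ITEMS = ["bat_and_ball", "widgets", "lily_pads"]

# Hypothetical placeholders for the four answer options of each item.
OPTIONS = ["option_a", "option_b", "option_c", "option_d"]


def build_crt_form(participant_id: int, base_seed: int = 0):
    """Return a list of (item, shuffled_options) pairs for one participant.

    Seeding by participant id (an assumption of this sketch) makes the
    option order reproducible for each participant.
    """
    rng = random.Random(base_seed * 1_000_003 + participant_id)
    form = []
    for item in CRT_ITEMS:
        opts = OPTIONS.copy()
        rng.shuffle(opts)  # option order varies by participant and item
        form.append((item, opts))
    return form
```

Seeding a separate generator per participant keeps the shuffles independent of the order in which participants arrive, which may simplify auditing of the presented option order.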
**The Main Task**
Each participant acts as a hospital administrator and is presented with a table of patient recovery times across nine hospital units over six months. Some units implemented a new staff training programme in March; others continued with the old procedures throughout the period. Participants are asked to evaluate, based on the data, whether the training programme improved health outcomes and to write a short response (maximum 500 characters). They are given 10 minutes to submit their response.
The correct conclusion is that the training had no detectable effect: recovery times fell across all nine units — including the four that never received training — suggesting a confounding time trend rather than a genuine treatment effect. Reaching this conclusion requires comparing trained and untrained units rather than looking only at within-unit trends.
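The comparison described above can be illustrated with a small difference-in-differences calculation. The numbers below are entirely hypothetical and are not the table shown to participants: the sketch assumes five trained and four untrained units, a unit-specific baseline, and a common decline of 0.5 days per month, and it treats March onward as the post-training period.

```python
from statistics import mean

MONTHS = [1, 2, 3, 4, 5, 6]   # January to June
TRAINING_MONTH = 3            # training introduced in March
UNITS = range(1, 10)          # nine hospital units
TRAINED = {1, 2, 3, 4, 5}     # assumed labels; four units are never trained


def recovery_days(unit: int, month: int) -> float:
    """Hypothetical recovery time: unit baseline plus a common time trend."""
    return 20.0 + unit - 0.5 * month


def mean_change(units) -> float:
    """Mean post-period recovery time minus mean pre-period recovery time."""
    pre = mean(recovery_days(u, m) for u in units for m in MONTHS
               if m < TRAINING_MONTH)
    post = mean(recovery_days(u, m) for u in units for m in MONTHS
                if m >= TRAINING_MONTH)
    return post - pre


trained_change = mean_change(TRAINED)
untrained_change = mean_change(set(UNITS) - TRAINED)

# Difference-in-differences: the change in trained units net of the
# change in untrained units over the same period.
did_estimate = trained_change - untrained_change
```

In this constructed example both groups improve by the same amount, so the within-unit trend looks like a benefit while the between-group comparison is zero, which is the pattern the task is designed around.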
After submitting their response, participants report their own assessment of the training programme's effectiveness on a seven-point scale from −3 (strongly harmful) to +3 (strongly beneficial), with 0 indicating no effect. They also record their expected grade (the grade they expect to receive from their partner) on a 1–10 scale. These measures are not reported to their partner.
**Grading Procedure**
After completing the task, each participant grades their partner's response on a 1–10 scale. In the sequential condition, one participant in each pair is randomly designated the first mover and submits their grade first. The responder then observes their received grade before assigning their own. The responder's window for adjustment is thus identified by the timing of information revelation. In the simultaneous condition, both participants submit grades without knowledge of the other's assessment.
After submitting a grade, participants are asked to provide feedback on the grading criteria they applied. This feedback will not be used for analysis; it serves only to inform the design and implementation of future studies.
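One way to operationalise the surprise component of a received grade is the gap between the grade a responder received and the grade they expected. The registration does not state a formula, so the definition below and all field names are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Responder:
    """Grades recorded for a second mover in the sequential condition."""
    expected_grade: int   # 1-10, elicited before grading
    received_grade: int   # 1-10, assigned by the first mover
    assigned_grade: int   # 1-10, given in return by the responder

    def __post_init__(self):
        # Guard against values outside the 1-10 grading scale.
        for value in (self.expected_grade, self.received_grade,
                      self.assigned_grade):
            if not 1 <= value <= 10:
                raise ValueError("grades must lie on the 1-10 scale")


def surprise(responder: Responder) -> int:
    """Assumed definition: received grade minus expected grade.

    Positive values mean the responder was graded more generously
    than they expected; negative values mean less generously.
    """
    return responder.received_grade - responder.expected_grade
```

For example, a responder who expected a 6 and received an 8 has a surprise of +2 under this definition.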
**Survey**
After grading, participants complete a short post-task survey comprising demographic questions (age, gender, study major, and year of study) and a validated measure of subjective numeracy (SNS-3 from McNaughton et al., 2015).

Experimental Design Details

Not available

Randomization Method

Participants are randomly assigned to condition at the session level, with equal allocation targeted across conditions. Within sessions, dyad formation and first-mover designation are determined by random draw at the time of pairing.
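The pairing step described above might look like the following sketch. It assumes the session's condition has already been drawn and that the session has an even number of participants; the function name, the dictionary layout, and the use of Python's `random` module are illustrative choices, not the procedure's actual implementation.

```python
import random


def form_dyads(participant_ids, condition: str, rng: random.Random):
    """Randomly pair participants and, if sequential, draw a first mover.

    `condition` is "simultaneous" or "sequential" and is assumed to be
    assigned to the whole session beforehand.
    """
    if condition not in ("simultaneous", "sequential"):
        raise ValueError("unknown condition")
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of participants is required")

    rng.shuffle(ids)  # random draw at the time of pairing
    dyads = []
    for a, b in zip(ids[0::2], ids[1::2]):
        # Only the sequential condition has a designated first mover.
        first = rng.choice([a, b]) if condition == "sequential" else None
        dyads.append({"members": (a, b), "first_mover": first})
    return dyads
```

A call such as `form_dyads(range(10), "sequential", random.Random(2026))` returns five dyads, each with one member marked as first mover; passing an explicit generator keeps the draw reproducible for record keeping.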

Randomization Unit

Individual participant

Was the treatment clustered?

Experiment Characteristics

Sample size: planned number of clusters

Sample size: planned number of observations

180 participants

Sample size (or number of clusters) by treatment arms

60 participants (30 dyads) in the simultaneous condition and 120 participants (60 dyads) in the sequential condition, of whom 60 will be first movers and 60 will be second movers.

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

Supporting Documents and Materials

There is information in this trial unavailable to the public.

IRB

Institutional Review Boards (IRBs)

IRB Name

Cyprus National Bioethics Committee

IRB Approval Date

2020-07-24

IRB Approval Number

ΕΕΒΚ ΕΠ 2020.01.166

Analysis Plan