Don't judge a paper by its cover

Last registered on November 26, 2023


Trial Information

General Information

Don't judge a paper by its cover
Initial registration date
October 31, 2023


First published
November 08, 2023, 11:25 AM EST


Last updated
November 26, 2023, 4:34 PM EST



Primary Investigator

University of Torino and Collegio Carlo Alberto

Other Primary Investigator(s)

PI Affiliation
University of Turin, Collegio Carlo Alberto and University of Amsterdam
PI Affiliation
University of Turin and Collegio Carlo Alberto
PI Affiliation
University of Turin, Collegio Carlo Alberto and Université Paris 1 Panthéon-Sorbonne

Additional Trial Information

In development
Start date
End date
Secondary IDs
Prior work
This trial is based on or builds upon one or more prior RCTs.
We study the extent to which an author's affiliation can bias reviewers grading papers submitted to international conferences. We exploit a PhD Workshop in Economics organized by local PhD candidates. Affiliation is often used as a signal to form beliefs about research quality (Blank 1991). We test whether such signals can be misinterpreted and exacerbate inequalities across researchers from differently ranked institutions.

We define affiliation bias as any difference between blind and non-blind assessments of the same submission. The conference accepts extended abstracts only. From all submissions we remove the title, authorship, acknowledgments, and indications of preliminary work, and we prepare a blind and a non-blind version of each. Abstracts prepared for blind grading also have the authors' affiliations removed, while submissions prepared for non-blind grading keep the submitter's affiliation visible. Every submission is read by at least one blind and one non-blind reviewer. Grading is based on peer review, with applicants to the conference rating the submissions. We will randomly assign reviewers to either the blind or the non-blind assessment group, following a multiple-stage randomization procedure.
External Link(s)

Registration Citation

Carreras, Enrique et al. 2023. "Don't judge a paper by its cover." AEA RCT Registry. November 26.
Experimental Details


We randomly allocate reviewers to either blind or non-blind grading. All reviewers from the same institution will be assigned to the same treatment arm to avoid contamination. In addition, they will not be informed of their treatment allocation, nor that they are involved in an experiment studying affiliation bias. Reviewers in each arm will receive a block of extended abstracts to evaluate and will be paired across treatment arms, meaning that a block assigned to a blind reviewer will also be evaluated by a corresponding non-blind reviewer.
We remove from all submissions information on title, authorship, acknowledgments, and indications of preliminary work. Abstracts prepared for blind grading also have the authors' affiliation(s) removed, while submissions edited for non-blind grading keep the submitter's affiliation visible. Affiliation(s) will be positioned at the top of the title page. In the case of multiple affiliations, we retain all affiliations of the submitter.
We will not disclose to reviewers that an experiment is taking place until after they submit their grades. Reviewers will learn that they are part of an experiment in the last phase of the data collection tool. Reviewers will be instructed not to share any information about the grading. Abstracts are distributed to reviewers through a dedicated shared folder that is accessible only to them. Blind and non-blind extended abstracts will be renamed following separate naming conventions to increase coordination costs for reviewers across different institutions. Reviewers from the same institution share the same treatment arm but grade a different block of papers. Both blind and non-blind versions will be made available as non-searchable PDF documents to increase the cost of searching for the missing information online.
Reviewers are asked to assess submissions along three grading criteria: research question, writing style, and research design. All grades range from 1 to 10. Reviewers will also be asked for a recommendation on acceptance. For each submission, we will also include an optional open-ended box for reviewers to comment on the abstract and explain their grade.
The same data collection tool will include questions on beliefs about how the work of the average person from the reviewer’s own PhD program compares to the submission they are evaluating. Finally, they will also be asked about their interest in meeting the author.
After collecting grades, we will ask reviewers a number of questions to assess threats to validity. We will ask reviewers about the perceived purpose of the grading scheme. Next, we will ask reviewers to rank universities into separate tiers based on their perceived quality. We will augment this list of universities with fillers and the reviewers' own institutions. We will also collect information on the perceived importance of title, name, affiliation, and acknowledgments in evaluating papers, both for themselves and for the economics profession. Finally, we will collect basic socio-demographic questions and measures of sociability.
Reviewers will receive a grading package up to 3 days after the submission deadline to allow for submission allocation and anonymization. The grading package consists of instructions for grading and either blind or non-blind abstract versions, based on the reviewer-level randomized assignment. Reviewers are instructed not to share or discuss the grading package they receive. Reviewers have up to 14 days to grade 8 extended abstracts each. Grades are sent back to the conference organizers via a dedicated form. In order to submit the form, they will have to answer additional questions on beliefs about relative quality, university rankings, the perceived importance of different signals, and socio-demographics.
Intervention Start Date
Intervention End Date

Primary Outcomes

Primary Outcomes (end points)
- Grade A
- Grade B
- Grade C
- Recommendation on acceptance
Primary Outcomes (explanation)
- Grade A is assigned to a submission based on the originality of the research question and the contribution to the literature.
- Grade B is assigned to a submission based on writing clarity and readability.
- Grade C is assigned to a submission based on the soundness of the research design (empirics or theory).
All grades range from 1 to 10.
- Recommendation for acceptance. Reviewers are to choose one of the following options: (A) Definitely accept: very good paper. (B) Probably accept: good paper. (C) Might accept: OK paper. (D) Don't think this paper can be accepted.

Secondary Outcomes

Secondary Outcomes (end points)
- Beliefs about quality of work from own institution compared to submitted abstracts
- Interest in meeting the speaker
- Ranking of institutions according to tiers
- Comments from open-ended field
Secondary Outcomes (explanation)
- Beliefs about the quality of average work from own institution compared to submissions. We want to elicit whether reviewers perceive work from their own institution to be of the same, higher, or lower quality than the submitted abstract. We believe this might be influenced by affiliation bias.
- Interest in meeting the speaker. Reviewers will be asked whether they are interested in meeting the speaker. We will use this question to explore networking as a potential mechanism.
- Ranking of institutions according to tiers. We will ask reviewers to assign each institution on the list of applicants' institutions to a quartile. We will use this information as a subjective measure of affiliation quality. We suspect it might differ from more established rankings for the population of early-career PhD researchers. This measure of perceived quality might also be more easily swayed by high-quality applicants originating from less recognized institutions.
- Comments from the open-ended field. We will explore any qualitative feedback that reviewers might want to share. We will analyze any differences in the language used to refer to submissions from different universities.

Experimental Design

Experimental Design
We compare reviews for the same extended abstract by blind and non-blind reviewers. We will use the whole body of submissions to a PhD Conference in Economics organized in Paris. Submissions are open to PhD students and early post-docs from any university.

Applicants to the conference will grade submissions as peer reviewers. We start from the list of applicants' affiliations and stratify them into two tiers based on subject-specific university rankings. Within each tier, we randomize the reviewers' affiliations into either blind or non-blind grading. This methodology ensures that reviewers from the same institution are assigned to the same treatment arm.

We match blind and non-blind reviewers from different institutions to form pairs. Pairs are formed such that reviewers from the top tier are matched with reviewers from the second tier, and remaining reviewers from the second tier are paired with someone from the same tier. This pairing methodology allows us to maximize the number of pairs of each type and explore potential heterogeneity across pair types.

We randomize papers into blocks and randomly assign the blocks to pairs. We will create as many blocks as there are reviewer pairs. Every block will contain 8 randomly chosen submissions. For each pair, we only consider blocks that do not contain submissions from the institutions the reviewers are affiliated with. This ensures that reviewers will not receive papers from their own institution. Reviewers submit their grades within 14 days of receiving the grading package.
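The three randomization stages described above can be sketched as follows. This is a simplified, hypothetical illustration: the institution names, tier composition, and the exact tier-crossing pairing rule are placeholders, not the study's actual data or code.

```python
import random

random.seed(2023)  # fixed seed so the assignment is reproducible

# Stage 1: randomize affiliations (not individual reviewers) to blind /
# non-blind grading, stratified by ranking tier.
tiers = {
    1: ["Uni A", "Uni B", "Uni C", "Uni D"],
    2: ["Uni E", "Uni F", "Uni G", "Uni H"],
}
arm = {}
for tier, institutions in tiers.items():
    shuffled = random.sample(institutions, len(institutions))
    half = len(shuffled) // 2
    for inst in shuffled[:half]:
        arm[inst] = "blind"
    for inst in shuffled[half:]:
        arm[inst] = "non-blind"

# Pair blind with non-blind institutions (simplified: the study matches
# top-tier with second-tier first, then second-tier with second-tier).
blind = [i for i in arm if arm[i] == "blind"]
nonblind = [i for i in arm if arm[i] == "non-blind"]
pairs = list(zip(blind, nonblind))

# Stage 2: randomize submissions into blocks of 8, one block per pair.
submissions = [f"paper_{k}" for k in range(8 * len(pairs))]
random.shuffle(submissions)
blocks = [submissions[i:i + 8] for i in range(0, len(submissions), 8)]

# Stage 3: randomly assign blocks to reviewer pairs. In the field,
# blocks containing a submission from either reviewer's own
# institution would be excluded before this step.
random.shuffle(blocks)
assignment = dict(zip(pairs, blocks))
```

Randomizing at the institution level in Stage 1 is what guarantees that reviewers from the same institution share a treatment arm.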
Experimental Design Details
Randomization Method
Randomization is done in the office by a computer for all stages.
The first stage randomizes affiliations to blind or non-blind grading.
The second stage randomizes papers to blocks.
The third stage randomizes blocks of papers to pairs of reviewers.
The randomization takes place on 06/11/2023 upon the closing of the call for papers.
Randomization Unit
Was the treatment clustered?

Experiment Characteristics

Sample size: planned number of clusters
100 external reviewers (expected)
Sample size: planned number of observations
800 grades for 100 submissions (expected)
Sample size (or number of clusters) by treatment arms
Control group: 400 grades for 100 submissions (expected)
Treated group: 400 grades for 100 submissions (expected)
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
- Grade A: 0.359405526
- Grade B: 0.369812832
- Grade C: 0.29574825
- Probability of definitely/probably accepting a paper: 0.104263765
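The registered MDEs account for sample design and clustering. For reference, a minimal sketch of the simpler unclustered two-sample calculation is below, at conventional thresholds (two-sided α = 0.05, power = 0.80); the standard deviation passed in is an illustrative placeholder, not a value from the study.

```python
from math import sqrt
from statistics import NormalDist

def mde(n_per_arm, sd, alpha=0.05, power=0.80):
    """Minimum detectable effect for a two-sample comparison of means
    (no clustering adjustment)."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(power)) * sd * sqrt(2 / n_per_arm)

# Example call with 400 grades per arm and an assumed (placeholder) SD:
effect = mde(n_per_arm=400, sd=1.8)
```

With clustering at the institution level, the effective sample size shrinks by the design effect, so the clustered MDEs reported above are larger than this naive formula would give for the same inputs.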

Institutional Review Boards (IRBs)

IRB Name
ETHICS COMMITTEE of Collegio Carlo Alberto
IRB Approval Date
IRB Approval Number


Post Trial Information

Study Withdrawal



Is the intervention completed?
Data Collection Complete
Data Publication

Data Publication

Is public data available?

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials