Going Beyond the Mean: How Gender Affects the Distribution of Evaluations and How the Distribution of Evaluations Affects Hiring Decisions

Last registered on August 10, 2023

Trial Information

General Information

Title
Going Beyond the Mean: How Gender Affects the Distribution of Evaluations and How the Distribution of Evaluations Affects Hiring Decisions
RCT ID
AEARCTR-0011877
Initial registration date
August 02, 2023

First published
August 10, 2023, 12:59 PM EDT

Locations

Location information for this trial is unavailable to the public.

Primary Investigator

Affiliation
Monash University

Other Primary Investigator(s)

PI Affiliation
Monash University
PI Affiliation
University of Gothenburg

Additional Trial Information

Status
In development
Start date
2023-08-15
End date
2024-08-15
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
In this project, we investigate the impact of various distributional metrics of job applicants on the evaluation of applicants and their likelihood of being selected for hiring. We also study whether these distributional metrics have similar effects depending on the gender of the applicants and whether the gender of the applicants is known.
External Link(s)

Registration Citation

Citation
Avery, Mallory, Andreas Leibbrandt and Joseph Vecci. 2023. "Going Beyond the Mean: How Gender Affects the Distribution of Evaluations and How the Distribution of Evaluations Affects Hiring Decisions." AEA RCT Registry. August 10. https://doi.org/10.1257/rct.11877-1.0
Experimental Details

Interventions

Intervention(s)
In this project, we conduct multiple interventions to investigate the impact of various distributional metrics of job applicants on the evaluation of applicants and their likelihood of being selected for hiring. We also study whether these distributional metrics have similar effects depending on the gender of the applicants and whether the gender of the applicants is known.
Intervention Start Date
2023-08-15
Intervention End Date
2024-08-15

Primary Outcomes

Primary Outcomes (end points)
We collect the following primary outcomes:
- Choose to hire: defined as the applicant, out of a pair, whom the evaluator chooses to recommend for hiring
Primary Outcomes (explanation)
In an earlier survey study, we found evidence that larger means and lower variances and minimums, relative to the other applicant, led to an increased chance of being hired. We thus hypothesize that these patterns will carry over to the experiment:

Hypothesis 1a: When mean differences are small, metrics other than the mean (i.e., variance, range, maximum, minimum, outliers, skew; see the computational sketch following these hypotheses) will have predictive power in determining which of a pair of applicants is chosen to be hired.

Hypothesis 1b: When mean differences are small, an applicant having a higher mean or maximum will increase their likelihood of being hired, while a higher variance will decrease their likelihood of being hired.

Small mean differences are defined based on the three evaluation sets per pair that generated the smallest mean difference while retaining trade-offs between the two applicants, i.e., such that neither applicant strictly or weakly dominated the other in terms of evaluations. The largest mean difference in our sample is 5.33 on a 0-100 scale. Because we focus on this case of smaller mean differences, we acknowledge that our results may not generalize to cases where the difference in mean evaluations is more substantial.

Furthermore, based on this earlier evidence, we hypothesize the following treatment interactions:

Hypothesis 2: The predictive power of the evaluation metrics will be diminished when gender is known, compared to when gender is not known.

Hypothesis 3: When gender is known, conditional on the mean, we expect that the metrics will have a less positive (or more negative) impact on hiring decisions for women than for men.

Hypothesis 4: The benefit (cost) from having a higher (lower) mean and a lower (higher) variance will be amplified for men (women) in mixed-gender pairs relative to when gender is not known.
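As a concrete illustration only, the following is a minimal sketch (not the authors' code) of how the metrics named in Hypothesis 1a could be computed for a single applicant's evaluation set. The two-standard-deviation outlier rule and the example scores are our own assumptions:

```python
import numpy as np
from scipy.stats import skew

def evaluation_metrics(scores):
    """Distributional metrics for one applicant's set of evaluation scores."""
    s = np.asarray(scores, dtype=float)
    sd = s.std(ddof=1)
    return {
        "mean": s.mean(),
        "variance": s.var(ddof=1),                  # sample variance
        "range": float(s.max() - s.min()),
        "maximum": float(s.max()),
        "minimum": float(s.min()),
        "skew": float(skew(s)),
        # Illustrative outlier rule (our assumption): > 2 SD from the mean.
        "n_outliers": int(np.sum(np.abs(s - s.mean()) > 2 * sd)),
    }

# Two applicants with equal means (82) but different spreads on the 0-100 scale,
# i.e., the kind of trade-off Hypothesis 1b is about.
low_spread = evaluation_metrics([78, 80, 82, 84, 86])
high_spread = evaluation_metrics([60, 75, 84, 93, 98])
print(low_spread["variance"], high_spread["variance"])
```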

Secondary Outcomes

Secondary Outcomes (end points)
To understand possible mechanisms, we elicit the following secondary outcome variables:
- Time spent: as a proxy for attention, we will measure how long evaluators spend on each pair. Greater time spent is interpreted as greater attention to making the right decision.
- Gender: we will measure the likelihood that the chosen individual is female, given whether gender is known and the gender composition of the pair.
- Quality: we will measure the relative quality of the chosen applicant, as measured by their qualifications, their average evaluation score across all evaluations given, and the AI-generated scores assigned to each applicant.
Secondary Outcomes (explanation)
See above
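As an illustration of how these secondary outcomes could be constructed, here is a minimal sketch under assumed data; every column name below is hypothetical, not from the study:

```python
import pandas as pd

# One row per evaluator-pair decision; all column names are hypothetical.
df = pd.DataFrame({
    "evaluator_id":     [1, 1, 2, 2],
    "gender_known":     [1, 1, 0, 0],
    "seconds_on_pair":  [41.2, 35.0, 58.7, 47.3],   # proxy for attention
    "chosen_is_female": [0, 1, 1, 0],
    "chosen_avg_score": [82.0, 79.5, 81.0, 80.2],   # quality proxy
})

# Time spent: average time per pair, by evaluator.
time_spent = df.groupby("evaluator_id")["seconds_on_pair"].mean()

# Gender: share of choices selecting a woman, by whether gender was known.
female_share = df.groupby("gender_known")["chosen_is_female"].mean()

# Quality: average evaluation score of the chosen applicant, by treatment.
chosen_quality = df.groupby("gender_known")["chosen_avg_score"].mean()
print(time_spent, female_share, chosen_quality, sep="\n")
```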

Experimental Design

Experimental Design
Our design aims to measure the impact of the distribution of applicants’ evaluations on hiring decisions.
Experimental Design Details
Not available
Randomization Method
Randomization will be carried out by a computer.
Randomization Unit
There are multiple forms of randomization:
1) Randomization into the Gender-Known and No-Gender treatments is across subjects, and thus at the individual level.
2) We randomize which pairs are shown to each evaluator; that is, the pairs shown are randomized at the individual evaluator level.
3) For each evaluator, we also randomize the set of evaluation scores shown. While the evaluation sets are pre-determined, which evaluation set is shown is randomized at the individual evaluator level.
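A minimal sketch of these three randomization steps follows; the parameter values mirror the design described in this registration, but the code itself is illustrative, not the study's:

```python
import random

N_PAIRS = 10              # pairs of applicants
SETS_PER_PAIR = 3         # pre-determined evaluation sets per pair
PAIRS_PER_EVALUATOR = 3   # pairs shown to each evaluator

def assign_evaluator(rng):
    # 1) Between-subject treatment assignment at the individual level.
    treatment = rng.choice(["Gender-Known", "No-Gender"])
    # 2) Which pairs this evaluator sees, drawn without replacement.
    pairs = rng.sample(range(N_PAIRS), PAIRS_PER_EVALUATOR)
    # 3) For each pair, which pre-determined evaluation set is shown.
    sets = {p: rng.randrange(SETS_PER_PAIR) for p in pairs}
    return treatment, pairs, sets

rng = random.Random(2023)  # seed makes this sketch reproducible
print(assign_evaluator(rng))
```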
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
See above: in most cases randomization occurs at the individual level, so we cluster at the individual level.
Sample size: planned number of observations
Each evaluator is shown 3 pairs. We have 10 pairs of applicants and 3 evaluation sets per pair, so there are 30 possible pair-evaluation-set combinations (10*3) that could be shown to evaluators. If we collect 500 evaluators who each see 3 pairs, we obtain at most 1,500 observations. Dividing by 2 across the Gender-Known and No-Gender treatments gives 750 observations per treatment, or 25 observations per possible combination (750/30 = 25). The total number of evaluator observations will be 1,500.
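The counting arithmetic above, restated as a short self-contained check (variable names are ours):

```python
# Observation counts implied by the design described above.
evaluators = 500
pairs_per_evaluator = 3
n_pairs, sets_per_pair = 10, 3

total_obs = evaluators * pairs_per_evaluator             # 500 * 3 = 1500
obs_per_treatment = total_obs // 2                       # 1500 / 2 = 750
combinations = n_pairs * sets_per_pair                   # 10 * 3 = 30
obs_per_combination = obs_per_treatment // combinations  # 750 / 30 = 25

print(total_obs, obs_per_treatment, combinations, obs_per_combination)
```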
Sample size (or number of clusters) by treatment arms
Gender Known: 750
Gender Unknown: 750
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Assuming that the proportion of the study population with a value of 1 for the binary outcome in the absence of treatment is 0.50, and conservatively treating each evaluator as a single observation, we have an MDE of 0.11 for our main outcome, “Choose to hire”.
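The registration does not state the test, significance level, or power behind the 0.11 figure, so the following is only a back-of-the-envelope check under conventional assumptions (two-sided alpha = 0.05, 80% power, two-sample comparison of proportions); the per-arm sample sizes come from the counts above:

```python
from scipy.stats import norm

def mde_two_proportions(n_per_arm, p0=0.50, alpha=0.05, power=0.80):
    """MDE for a two-sample test of proportions with equal arms, baseline p0."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z * (2 * p0 * (1 - p0) / n_per_arm) ** 0.5

# Conservative case: 250 evaluators per arm, one observation each.
print(mde_two_proportions(250))  # ~0.125
# Observation-level case: 750 evaluator-pair observations per arm.
print(mde_two_proportions(750))  # ~0.072
```

Under these assumptions, the conservative evaluator-level case gives roughly 0.125 and the observation-level case roughly 0.072, bracketing the registered MDE of 0.11; the exact inputs behind the 0.11 figure are not stated in the registration.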
IRB

Institutional Review Boards (IRBs)

IRB Name
Monash University Ethics Committee
IRB Approval Date
2018-09-18
IRB Approval Number
14985