Characterizing Firm-Level Discrimination

Last registered on January 12, 2020

Pre-Trial

Trial Information

General Information

Title
Characterizing Firm-Level Discrimination
RCT ID
AEARCTR-0004739
Initial registration date
September 20, 2019

First published
September 23, 2019, 4:09 PM EDT

Last updated
January 12, 2020, 5:14 PM EST

Locations

Region

Primary Investigator

Affiliation
UC Berkeley

Other Primary Investigator(s)

PI Affiliation
UC Berkeley
PI Affiliation
UC Berkeley

Additional Trial Information

Status
In development
Start date
2019-10-01
End date
2021-10-01
Secondary IDs
Abstract
We propose a stratified correspondence study of large American employers aimed at detecting firm-level discrimination on the basis of race, sex, and age. To construct accurate firm-level estimates, 125 (geographically distinct) job vacancies will be sampled from each employer and 8 fictitious applications will be sent to each job. Employer responses to the experiment will be used to measure the degree to which discriminatory jobs are clustered in particular firms. We will then characterize how discriminatory behavior covaries with firm and establishment level characteristics.
External Link(s)

Registration Citation

Citation
Kline, Patrick, Evan Rose and Christopher Walters. 2020. "Characterizing Firm-Level Discrimination." AEA RCT Registry. January 12. https://doi.org/10.1257/rct.4739-1.2
Experimental Details

Interventions

Intervention(s)
We will conduct a resume correspondence study that randomly assigns race, gender, and age to job applications submitted to a large number of job vacancies. We will study variation in gaps in callback rates across these groups to determine the extent to which discriminatory jobs are clustered in particular firms.
Intervention Start Date
2019-10-01
Intervention End Date
2021-10-01

Primary Outcomes

Primary Outcomes (end points)
The key outcomes are callbacks from employers to applications. We are interested in mean differences in callback rates across legally protected categories as well as between-firm and within-firm variation in these differences.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Our experiment will send 8 resumes with randomly assigned characteristics to 125 job vacancies at each of a set of large firms. Race will be indicated using racially distinctive names. Each job will receive 4 white and 4 black applicants. Gender (male or female) and age (above or below age 40) will be randomly assigned to each resume.
Experimental Design Details
We have assembled a list of the largest employers in the US based upon the Fortune 500. Where possible, we have split large holding companies into their constituent subsidiaries, since these entities post vacancies separately. We have selected the 100 largest firms that are feasible to audit. These firms regularly post entry-level jobs through an easily accessible online portal and are expected to have vacancies in at least 125 distinct counties.

Our experiment will randomly sample 125 job vacancies for each of these firms, each in a distinct county. In cases where a firm has multiple vacancies open in a US county, we will select the job posted most recently to minimize the chances of applying to a job that has already been filled. Previous audit studies (and a pilot described below) demonstrate that sending up to 8 fictitious applications per job is feasible without employer detection (see, e.g., Banerjee et al., 2009 and Arceo-Gomez and Campos-Vasquez, 2014). Sending eight applications should enable us to precisely quantify job-level variation in discrimination within a given firm. We will stratify our analysis on race, sending four distinctively white and four distinctively black applications to each vacancy. Because our proposed design entails sampling 125 vacancies per firm, we expect to send 1,000 applications to each firm, for a total of 100,000 applications.
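
For concreteness, this sampling rule can be sketched in a few lines of Python. This is an illustrative sketch only: the postings data model (firm, county, posted_date fields) is our assumption, not a description of the actual software.

    import random
    from collections import defaultdict

    def sample_vacancies(postings, n_counties=125, seed=0):
        """Keep the most recent vacancy per (firm, county), then draw
        up to 125 distinct counties per firm."""
        rng = random.Random(seed)
        newest = defaultdict(dict)  # firm -> {county: most recent posting}
        for p in postings:  # each posting: {"firm", "county", "posted_date", ...}
            best = newest[p["firm"]].get(p["county"])
            if best is None or p["posted_date"] > best["posted_date"]:
                newest[p["firm"]][p["county"]] = p
        sampled = []
        for firm, by_county in newest.items():
            counties = rng.sample(sorted(by_county), min(n_counties, len(by_county)))
            sampled.extend(by_county[c] for c in counties)
        return sampled
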
We have developed a software platform that constructs fictitious applicant identities. An important step in the process is the choice of racially distinctive names. We choose female and male first names from the lists in Bertrand and Mullainathan (2004). For last names we use the Decennial Census, selecting for each race group the 10 surnames with the highest racial shares among those that occur at least 10,000 times nationally, mimicking BM's selection strategy. With BM's nine first names for each race and gender, this provides 90 unique full names for each of the four race and gender groups.

To avoid applying to the same firm with the same name more than once, we supplement this list using speeding ticket records from North Carolina. We select the most common first names with > 90% racial share among speeders ticketed between 2006 and 2018 and born between 1974 and 1979. We add 10 new names for each race and gender group from this data, skipping over those already included. Forty-seven percent of BM's first names are encountered while doing so, suggesting that this data source captures a highly similar set of names. We combine these names with 16 last names from Census data selected according to the same criteria so that we have a total of 1,000 unique full names for each race and gender group between the two sources.
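
As an illustration of the surname rule, a short sketch assuming the Census surname file has been loaded into a pandas DataFrame; the column names (name, count, and a per-group share column) are placeholders, as the real file uses different field names.

    import pandas as pd

    def pick_surnames(census: pd.DataFrame, share_col: str, k: int = 10) -> list:
        """Among surnames occurring at least 10,000 times nationally,
        take the k with the highest share for the given race group."""
        frequent = census[census["count"] >= 10_000]
        return frequent.nlargest(k, share_col)["name"].tolist()

    def full_names(first_names, last_names):
        """Cross first and last names to build the pool for one group."""
        return [f"{f} {l}" for f in first_names for l in last_names]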

Our software generates a fictitious application specific to each job. It begins by randomly assigning a race and gender to the application, stratifying so that each job receives four white and four black applications. A race- and gender-concordant name is then assigned, sampling without replacement within a firm. The software also assigns each applicant a random date of birth that implies an age between 22 and 58 years old with uniform probability. If the vacancy requires a social security number, the software assigns one from a sample of SSNs assigned to the deceased (these SSNs are made publicly available online by the Social Security Administration).
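
A minimal sketch of this assignment step follows; the names_by_group pool and used_names bookkeeping are illustrative stand-ins for the production software's state.

    import random
    from datetime import date, timedelta

    def assign_applications(rng, names_by_group, used_names):
        """Generate the 8 applications for one job: exactly 4 white and
        4 black, gender unconditionally random, age uniform on 22-58."""
        races = ["white"] * 4 + ["black"] * 4
        rng.shuffle(races)
        apps = []
        for race in races:
            gender = rng.choice(["male", "female"])
            pool = [n for n in names_by_group[(race, gender)] if n not in used_names]
            name = rng.choice(pool)
            used_names.add(name)  # no name repeats within a firm
            dob = date.today() - timedelta(days=rng.uniform(22 * 365.25, 58 * 365.25))
            apps.append({"race": race, "gender": gender, "name": name, "dob": dob})
        return apps
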
Our fictitious applicants have phone numbers automatically provisioned by the software using a service called Twilio. Any calls to the number are sent to voicemail. Any voicemail messages received are transcribed and emailed to a central account we use to collect all callback data. Any text messages are also forwarded. The software ensures that no two applications to the same firm share a phone number. Applicants have a Gmail email address created from their first name, last name, and a random string of digits. There is one email address for each first name / last name combination, ensuring that the same email address is not used for multiple applications to the same firm. Applicants also have a home address that we draw randomly from data publicly available from openaddresses.io. We choose a random home address in a zip code close to the vacancy.
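
Provisioning a number with Twilio's Python helper library looks roughly like the sketch below. The credentials and webhook URLs are placeholders; the voicemail transcription and forwarding logic described above would live behind them.

    from twilio.rest import Client

    client = Client("ACCOUNT_SID", "AUTH_TOKEN")  # placeholder credentials

    def provision_number(area_code):
        """Buy a local number near the vacancy and point it at our handlers."""
        candidates = client.available_phone_numbers("US").local.list(
            area_code=area_code, limit=1
        )
        number = client.incoming_phone_numbers.create(
            phone_number=candidates[0].phone_number,
            voice_url="https://example.org/voicemail",  # calls go to voicemail
            sms_url="https://example.org/sms",          # texts are forwarded
        )
        return number.phone_number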

Each applicant graduated the year they turned 18 from a public high school drawn from a database freely available from the Department of Education. We choose a school in a zip code near the focal job and provide the selected school's actual street address if required. Half of the applicants also have an Associate Degree (AD). We assign AD institutions from a Department of Education database listing all degree-granting institutions in the U.S. We randomly select a two-year degree-granting college in a zip code near the vacancy and a major from a list of common AD majors.
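
A sketch of the high school assignment, assuming a hypothetical list of school records with name, zip, and address fields:

    import random

    def assign_high_school(rng, dob, schools, nearby_zips):
        """School in a nearby zip code; graduation year is the year
        the applicant turned 18."""
        local = [s for s in schools if s["zip"] in nearby_zips]
        school = rng.choice(local)
        return {"high_school": school["name"],
                "address": school["address"],
                "graduation_year": dob.year + 18}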

Each application also has a randomly generated employment history consisting of two to three previous jobs lasting between nine and 24 months each, with no interrupting unemployment spells. Previous employers are drawn randomly from a Reference USA database of existing firms across the country. We choose previous employers located close to the vacancy with similar job types. For example, retail job applicants have employment histories at local restaurants and retailers. Each job has an appropriate position (e.g., “cashier”) and a supervisor with a name randomly chosen from a list of the most common names in the U.S. Each job uses the actual address of the firm sampled from the Reference USA database and a randomly generated phone number. Each job is also assigned a set of 2-3 duties from a database of example resumes scraped from jobhero.com.
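
The gap-free history can be generated by walking backward from the present, as in this sketch; the employer and duty lookups are stubbed out.

    import random
    from datetime import date, timedelta

    def employment_history(rng, pick_employer, pick_duties):
        """Two or three back-to-back jobs of 9-24 months each, ending today."""
        end = date.today()
        history = []
        for _ in range(rng.choice([2, 3])):
            start = end - timedelta(days=30 * rng.randint(9, 24))
            history.append({
                "employer": pick_employer(),  # nearby firm with a similar job type
                "start": start,
                "end": end,
                "duties": pick_duties(rng.randint(2, 3)),
            })
            end = start  # the next (earlier) job ends where this one starts
        return history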

Our software supplies all application information through an online web portal. When required, the software can also generate single page PDF resumes with randomly varying fonts, spacing, section ordering, and layouts to ensure no two PDFs have identical formatting. These PDFs will be uploaded to job application portals when required.
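
One way to implement the randomized layouts is with a PDF library such as reportlab, as sketched below; the registration does not name the library actually used.

    import random
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    def render_resume(path, sections, seed):
        """Render one resume with randomized font, spacing, and section order."""
        rng = random.Random(seed)
        font = rng.choice(["Helvetica", "Times-Roman", "Courier"])
        size = rng.choice([10, 11, 12])
        leading = size + rng.choice([2, 3, 4])  # randomized line spacing
        rng.shuffle(sections)                   # randomized section order
        c = canvas.Canvas(path, pagesize=letter)
        c.setFont(font, size)
        y = 750
        for header, lines in sections:
            c.drawString(72, y, header)
            y -= leading
            for line in lines:
                c.drawString(90, y, line)
                y -= leading
            y -= leading  # gap between sections
        c.save()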

We will submit applications by proceeding through our firm list in random order, sampling 25 establishments per firm at a time and sending 8 applications to each. Given that we plan to sample 125 total establishments per firm, the experiment will require five total iterations through the complete list. This strategy ensures that all firms are sampled over the course of the experiment while allowing us to concentrate our application effort on a handful of firms at a time.
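
The wave structure can be sketched as follows, where establishments_by_firm maps each firm to its 125 sampled establishments (names are illustrative):

    import random

    def build_waves(firms, establishments_by_firm, per_wave=25, n_waves=5, seed=0):
        """Random firm order; 25 establishments per firm per wave, so five
        waves cover all 125 establishments for every firm."""
        rng = random.Random(seed)
        order = list(firms)
        rng.shuffle(order)
        waves = []
        for w in range(n_waves):
            waves.append([
                (firm, establishments_by_firm[firm][w * per_wave:(w + 1) * per_wave])
                for firm in order
            ])
        return waves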


Randomization Method
All resume characteristics will be randomly assigned by computer as part of our resume generation software.
Randomization Unit
Race, gender and age will be randomly assigned to resumes. Race assignments will be stratified so that half of applicants are white and half are black at each job. Other resume characteristics will be unconditionally randomly assigned.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
100,000 total resumes sent to 12,500 jobs.
Sample size: planned number of observations
100,000 resumes
Sample size (or number of clusters) by treatment arms
We will have 50,000 resumes with white names and 50,000 resumes with black names. We will also have approximately 50,000 male and 50,000 female, as well as 50,000 above age 40 and 50,000 below, though the randomization does not ensure exact balance by gender and age.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
The minimum detectable effect size for the mean difference between black and white callback rates is 0.004. We are also interested in the share of variation in this difference across jobs due to between-firm differences. The minimum detectable effect size for the between-firm variance share is approximately 5 percent.
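
As a back-of-the-envelope check (ignoring clustering within jobs, and assuming a 5 percent baseline callback rate, which is our illustrative assumption rather than a number from the registration), a two-sided 5 percent test with 80 percent power and 50,000 resumes per arm gives an MDE near 0.004:

    from math import sqrt

    n = 50_000                    # resumes per arm (white names, black names)
    p = 0.05                      # assumed baseline callback rate (illustrative)
    z_alpha, z_beta = 1.96, 0.84  # 5% two-sided test, 80% power

    mde = (z_alpha + z_beta) * sqrt(2 * p * (1 - p) / n)
    print(f"{mde:.4f}")           # ~0.0039
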
IRB

Institutional Review Boards (IRBs)

IRB Name
UC Berkeley Committee for the Protection of Human Subjects
IRB Approval Date
2019-04-19
IRB Approval Number
2019-04-12080
Analysis Plan

There is information in this trial unavailable to the public.

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Is public data available?
No

Program Files

Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials