Experimental Design Details
Workers will be recruited with a “screening survey,” a format common on Prolific, that qualifies them to participate in future high-paying surveys. In this screening survey, their first session, workers will complete the three screening tasks and answer questions about their demographics and job history. After all workers have completed the screening survey, their scores and demographics will be aggregated into worker profiles. Workers will be grouped into sets of approximately 120 workers with similar education levels and average quiz scores. Quiz scores are shown on each worker profile as one to five stars, which correspond to approximate quintiles.
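As an illustration of the star display, the sketch below converts average quiz scores into one to five stars using quintile cut points. The function name, the use of the "inclusive" quantile method, and the tie-handling rule are assumptions made for the example, not details taken from the design.

```python
from bisect import bisect_right
from statistics import quantiles


def star_ratings(scores):
    """Map average quiz scores to 1-5 stars using quintile cut points.

    Illustrative only: the cut-point method and tie handling are assumptions.
    """
    # quantiles(n=5) returns the four cut points that separate the quintiles.
    cuts = quantiles(scores, n=5, method="inclusive")
    # Count how many cut points lie at or below each score; a score that
    # equals a cut point is placed in the higher bin.
    return [bisect_right(cuts, s) + 1 for s in scores]


if __name__ == "__main__":
    example_scores = [55, 62, 70, 71, 80, 84, 90, 91, 95, 99]
    print(list(zip(example_scores, star_ratings(example_scores))))
```

Because tied scores land in the same bin, the resulting groups are only approximately equal in size, which is one reason the stars would be approximate rather than exact quintiles.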
Each worker profile will be viewed by two managers: one sees a profile that includes screening scores and education, and another sees a profile that also includes race, gender, and age. Managers will each assign ten of the workers that they evaluate (8 percent) to the harder task. The algorithm will also be used to determine which ten workers in each group of 120 are likely to do the best job on the harder proofreading task given their scores on the screening tasks. Thus, each worker will be evaluated by all three job assignment mechanisms.
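The algorithmic selection step can be sketched as a top-ten choice within a group. The predicted performance values are a hypothetical input here: the text does not specify how the algorithm maps screening scores to expected performance, so the sketch only shows the final selection.

```python
import heapq


def select_top_k(predicted, k=10):
    """Return the IDs of the k workers with the highest predicted performance.

    predicted: dict mapping worker ID -> predicted performance on the harder
    task (a hypothetical model output, not specified in the design).
    """
    return set(heapq.nlargest(k, predicted, key=predicted.get))


if __name__ == "__main__":
    # Toy group of 120 workers whose predicted performance equals their index.
    group = {f"w{i}": float(i) for i in range(120)}
    print(sorted(select_top_k(group)))
```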
After all workers have been evaluated (2-3 weeks after workers complete the screening task), any worker who is assigned to the harder task by any of the three mechanisms will be assigned to the harder task. These workers (expected to be 15-20 percent of workers, capped at 24 percent by construction) will do the harder task and then finish the experiment. The main purpose of this rule is to remove any concern about selection in the remaining sample, since workers in the remaining sample of interest were all assigned to the easier task by all three mechanisms.
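The assignment rule amounts to taking the union of the three mechanisms' selections and keeping everyone else as the sample of interest. The sketch below uses hypothetical worker IDs and selection sets. Because each mechanism selects ten workers per group, the union can contain at most thirty workers per group, and fewer whenever the mechanisms overlap.

```python
def split_sample(workers, *selected_sets):
    """Split workers into those sent to the harder task and the sample of interest.

    workers: iterable of worker IDs in one group.
    selected_sets: one set of worker IDs per assignment mechanism.
    """
    # A worker does the harder task if ANY mechanism selected them.
    harder = set().union(*selected_sets)
    # The sample of interest was assigned to the easier task by ALL mechanisms.
    easier = [w for w in workers if w not in harder]
    return harder, easier


if __name__ == "__main__":
    workers = list(range(120))
    manager_without_demographics = set(range(0, 10))
    manager_with_demographics = set(range(5, 15))
    algorithm = set(range(12, 22))
    harder, easier = split_sample(
        workers, manager_without_demographics, manager_with_demographics, algorithm
    )
    print(len(harder), len(easier))
```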
Among this sample of interest, after agreeing to take the follow-up survey, workers will be randomly assigned which mechanism they are told was the one responsible for assigning them to the easier, lower-paying task. This revelation will be subtle: workers will be shown the profile of information that the manager or algorithm had available about them when making the assignment decision, along with the profiles of several of their coworkers, one of whom was assigned to the harder task by their manager and four of whom were also assigned to the easier task. Forty percent of the sample will be told they were assigned by a manager with access to demographics, forty percent will be told they were assigned by a manager without access to demographics, and twenty percent will be told they were assigned by an algorithm. This split is required to ensure that the two-stage least squares estimates (which use only the sample assigned to one of the manager arms) are well-powered. The design is also well-powered to detect intent-to-treat effects of the algorithm. Randomization will be stratified by race and gender to ensure the feasibility of estimating heterogeneous treatment effects, described below.
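One way to implement a 40/40/20 split stratified by race and gender is block randomization within each stratum. The sketch below is an assumption about how this could be done, not the registered procedure: the arm labels, the dictionary keys, and the block-of-five scheme are all illustrative.

```python
import random
from collections import defaultdict

# One block of five reproduces the 40/40/20 split exactly.
BLOCK = (
    ["manager_with_demographics"] * 2
    + ["manager_without_demographics"] * 2
    + ["algorithm"]
)


def stratified_assignment(workers, seed=0):
    """Assign each worker an arm, randomizing within (race, gender) strata.

    workers: list of dicts with hypothetical keys 'id', 'race', 'gender'.
    Returns a dict mapping worker ID -> arm label.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for w in workers:
        strata[(w["race"], w["gender"])].append(w["id"])

    assignment = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        # Walk through the stratum five workers at a time, giving each group
        # of five a freshly shuffled block of arms.
        for start in range(0, len(ids), len(BLOCK)):
            block = BLOCK[:]
            rng.shuffle(block)
            for worker_id, arm in zip(ids[start:start + len(BLOCK)], block):
                assignment[worker_id] = arm
    return assignment
```

When a stratum's size is a multiple of five, the shares within it are exactly 40/40/20. Otherwise the final, incomplete block receives a random subset of the arms, so the shares are approximate.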
After they are told how they were assigned, workers will be asked how many stars they think they would have needed to score on the screening quizzes in order to be assigned to the harder task by their manager. After they answer this question, they will be asked to imagine that they are a worker with a different (fictitious) profile with randomly assigned characteristics, and asked how many stars they think they would have needed to score on the screening quizzes in order to be assigned to the harder task. Differences between these answers for fictitious workers of different races and genders provide a measure of implicit perceived discrimination.
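A simple version of this measure is the difference in the mean number of stars that respondents say a fictitious worker would need, compared across two values of a randomly assigned characteristic. The sketch below assumes a plain difference in means and hypothetical field names. The actual analysis may instead use a regression with controls.

```python
from statistics import mean


def perceived_threshold_gap(responses, attribute, group_a, group_b):
    """Mean stars needed for fictitious profiles in group_a minus group_b.

    responses: list of dicts with a hypothetical 'stars_needed' key plus the
    randomly assigned characteristics of the fictitious profile.
    A positive value means respondents believe group_a faces a higher bar.
    """
    a = [r["stars_needed"] for r in responses if r[attribute] == group_a]
    b = [r["stars_needed"] for r in responses if r[attribute] == group_b]
    return mean(a) - mean(b)


if __name__ == "__main__":
    toy = [
        {"gender": "female", "stars_needed": 5},
        {"gender": "female", "stars_needed": 4},
        {"gender": "male", "stars_needed": 3},
        {"gender": "male", "stars_needed": 4},
    ]
    print(perceived_threshold_gap(toy, "gender", "female", "male"))
```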
Then, workers will do the easier proofreading task. Workers will know that they have to proofread at least six paragraphs to receive their completion payment and that they are able to proofread up to eighteen paragraphs (each for a bonus). If they proofread all eighteen paragraphs, they will be eligible to be evaluated again to do the harder task for a higher wage in a future survey (though they could also be assigned again to the easier task). After finishing the easier proofreading task, workers will be asked whether they would like to be evaluated again and assigned to a future proofreading survey. Ten percent of workers will be randomly selected and their choices implemented.
Next, workers in a manager arm will be asked at what wage they would want to work together with their manager on a similar task in the future, how much they would be willing to give up in wages to be able to choose their own manager (instead of a default of working with the same manager who assigned them in the main experiment), and how they would share a thank-you bonus with their manager. Each of these choices will be implemented for a randomly selected subset of participants.
Then, workers will answer questions about their self-efficacy to do the easier or harder job, job satisfaction, affective well-being, and complaints about the promotion process. They will also be asked whether they think they would have been assigned to the harder task if they had been evaluated by each of the two other mechanisms, or if they had been evaluated by the same mechanism but had a different race or gender (explicit measures of perceived discrimination).