
Fields Changed

Registration

Field: Abstract

Before:
Algorithmic decision support tools used in high-stakes domains such as criminal justice, hiring, and lending often exclude protected variables from input data to comply with anti-discrimination regulations and fairness principles. While these interventions are typically assessed based on quantitative disparity measures, their behavioral implications are less well explored. This experiment examines how human decision makers, who are assisted by these tools in making accurate decisions about others, respond to algorithmic predictions that omit protected group membership, and whether these reactions are related to their prior statistical beliefs about protected groups. It examines how such interventions affect belief updating, and how this in turn affects discrimination in subsequent decision outcomes. The experiment uses a hiring context in which participants predict the performance of individual workers on a math and science quiz, both before and after receiving algorithmic performance predictions for each worker. The workers are selected to be balanced by gender, with otherwise identical characteristics, allowing for the measurement of prior statistical beliefs about gender differences in quiz performance. The algorithm's input data varies between subjects regarding the inclusion of the gender variable. Participants are informed about the input data. Prediction results remain constant, as gender is neither a significant predictor nor correlated with significant predictors.

After:
Identical to the previous abstract, with the following addition: "Another treatment excludes month of birth (even vs. odd) from the training data, while keeping gender included. By excluding a non-stereotypical variable, the experiment differentiates whether participants' reactions are specifically driven by the omission of the gender variable, or whether their behavior changes due to the exclusion of any variable."
Field: Last Published
Before: September 05, 2024 04:15 AM
After: September 26, 2024 01:03 PM
Field: Intervention (Public)

Before:
Exclusion of the gender variable from the algorithm's data used for performance predictions. The input data is disclosed to participants. The prediction results and the algorithm's accuracy remain constant, as gender is neither a predictor of performance nor correlated with any of the predictors.

After:
(i) Exclusion of the gender variable, as before.
(ii) Exclusion of the month of birth (even vs. odd) variable from the algorithm's data used for performance predictions. The input data is disclosed to participants. The prediction results and the algorithm's accuracy remain constant, as month of birth (even vs. odd) is neither a predictor of performance nor correlated with any of the predictors.
Field: Experimental Design (Public)

Before:
At the start of the experiment, participants are informed that 400 U.S. adults, representative of the U.S. population, took part in an online 20-question Math and Science Quiz, and are briefed on the topics covered in the quiz. The experiment is divided into two parts; in each part, participants can earn an additional $5. At the end of the study, one of these parts is randomly selected for bonus payment.

In Part 1, participants review short CVs of eight pre-selected workers and estimate, for each worker, the likelihood that the worker's performance is in the top 50% relative to the performance of the other 400 workers ("top performer"). The order in which the workers are presented is randomized. The CVs include information on gender (female/male), level of education (no bachelor's degree / bachelor's degree or higher), and month of birth (even/odd). Participants then receive performance predictions for each worker from a machine learning algorithm and have the opportunity to revise their initial estimates. They are informed that the algorithm is trained on the remaining workers from the full 400-worker sample and uses all the CV variables in the baseline condition (excluding gender in the treatment condition), along with the workers' performance on a prior math and science quiz, to make the performance predictions. Performance on the prior quiz is strongly correlated with Math and Science Quiz performance, making the predictions informative. Prediction results remain constant, as gender is neither a significant predictor nor correlated with significant predictors. Participants are then shown the eight workers in the same order as before and submit their final estimates, which are used to determine their $5 bonus payment.

Part 2 follows immediately, in which participants complete a simple logic task after completing the hiring task. In the hiring task, participants are again presented with the eight workers and decide, for each individual, whether to hire them (yes/no). Participants are told that they will receive $2.50 if they solve the logic task correctly. For the bonus payment, one of the hiring decisions is randomly selected. If the selected decision involves hiring a top performer, the participant earns $5; if it involves hiring a worker who is not a top performer, the participant earns $0; if the selected decision is not to hire the worker, the participant keeps their $2.50. The hired worker in the randomly selected decision receives $2.50 regardless of performance.

Participants are then asked to estimate the accuracy of the algorithm (incentivized). The experiment concludes with a brief survey on attitudes toward algorithms and technology, knowledge about algorithms, perceptions of gender discrimination in the U.S., and a demographic questionnaire. Comprehension questions are presented throughout the experiment and must be answered correctly to proceed. All beliefs are elicited using the stochastic Becker-DeGroot-Marschak method.

The quiz was previously conducted as part of a separate online Math and Science Quiz study involving 400 U.S. adults, representative of the U.S. adult population, and consisted exclusively of ASVAB questions. These adults completed two similar 20-question quizzes consecutively. The variables presented in the CVs and used for the algorithm's predictions were selected based on results from a separate survey of 300 U.S. adults (representative sample) that measured beliefs about differences in Math and Science Quiz performance by gender, education level, and month of birth. The algorithm, a logistic regression model, is trained on the sample of 400 workers, excluding the eight selected workers.

After:
Identical to the previous design, except for the treatment description: the algorithm uses all the CV variables in the baseline condition, excludes gender in the first treatment condition, and excludes month of birth (even vs. odd numbered) in the second treatment condition. Prediction results remain constant, as gender (respectively, month of birth) is neither a significant predictor nor correlated with significant predictors.
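The hiring-task bonus rule described above can be sketched as a small payoff function. This is an illustrative sketch only (the function name and boolean encoding are not from the registration), and it assumes the participant solved the logic task and so holds the $2.50 endowment:

```python
def hiring_payoff(hired: bool, top_performer: bool, endowment: float = 2.50) -> float:
    """Payoff from the one randomly selected hiring decision.

    Assumes the $2.50 endowment from the logic task was earned.
    """
    if not hired:
        return endowment          # declined to hire: keep the $2.50
    return 5.00 if top_performer else 0.00  # hired: $5 if top performer, else $0

# The three possible outcomes:
print(hiring_payoff(hired=True, top_performer=True))    # 5.0
print(hiring_payoff(hired=True, top_performer=False))   # 0.0
print(hiring_payoff(hired=False, top_performer=True))   # 2.5
```

Note that hiring is only profitable for the participant if the worker's probability of being a top performer exceeds 0.5, since $5 × p > $2.50 exactly when p > 0.5.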
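To illustrate why the predictions can remain constant across conditions, the following minimal sketch trains a logistic regression with and without a gender column on synthetic data in which gender is independent of performance and of the other predictors. The feature names, sample sizes, and data-generating process are assumptions for illustration, not the authors' actual pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400  # size of the worker sample in the design

# Synthetic CV variables; gender is drawn independently of everything else.
gender = rng.integers(0, 2, n)            # 0 = male, 1 = female
education = rng.integers(0, 2, n)         # 0 = no bachelor's, 1 = bachelor's or higher
birth_month_even = rng.integers(0, 2, n)  # month of birth: 0 = odd, 1 = even
prior_quiz = rng.integers(0, 21, n)       # score on the prior 20-question quiz

# "Top performer" depends only on the prior quiz score, mirroring the design
# in which gender is neither a predictor nor correlated with the predictors.
top = (prior_quiz + rng.normal(0, 2, n) > np.median(prior_quiz)).astype(int)

X_aware = np.column_stack([gender, education, birth_month_even, prior_quiz])
X_blind = np.column_stack([education, birth_month_even, prior_quiz])  # gender excluded

p_aware = LogisticRegression().fit(X_aware, top).predict_proba(X_aware)[:, 1]
p_blind = LogisticRegression().fit(X_blind, top).predict_proba(X_blind)[:, 1]

# With gender uninformative, the two models' predicted probabilities
# are close; in the actual design the displayed predictions are identical.
print(f"max difference in predicted probabilities: {np.abs(p_aware - p_blind).max():.3f}")
```

In the experiment itself the eight rated workers are held out of the training sample, so the displayed predictions are out-of-sample; the sketch omits that split for brevity.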
Field: Planned Number of Clusters

Before:
2 treatment groups (gender-aware vs. gender-blind algorithm)

After:
Baseline: all variables included in the training data
Treatment 1: gender excluded
Treatment 2: month of birth (even vs. odd numbered) excluded
Field: Planned Number of Observations
Before: 700-900 participants
After: 1100-1350 participants