Experimental Design Details
I will recruit groups of 16 participants to form societies. Within each society, participants are randomly divided into two group identities defined by colour: green and purple. (Green and purple are used to separate the colour labels from any real-world affiliations such as politics or gender.) Each participant is also assigned a productivity value drawn from a discretised normal distribution with mean 50 and standard deviation 81.85. (These parameters were calibrated so that multiple equilibria, including a misattribution equilibrium, exist.) To simplify matters for participants, I tell them that there are 1000 balls in an urn, each with an integer written on it; half of the balls have a number below 50 and half a number at or above 50, and negative numbers exist. Participants are presented with a tool with a sliding bar that shows, for any selected value, how many balls in the urn carry a number above and below that value. This tool is visible at all times.
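For concreteness, the following sketch simulates the urn and the sliding tool. The rounding used to discretise the normal draws, and the seed, are illustrative assumptions rather than the exact implementation shown to participants.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative urn of 1000 integer balls: a discretised N(50, 81.85) draw.
# The exact discretisation used in the experiment is an assumption here.
N_BALLS = 1000
urn = np.rint(rng.normal(loc=50, scale=81.85, size=N_BALLS)).astype(int)

def slider_counts(urn, value):
    """Mimic the on-screen sliding tool: how many balls lie below,
    and how many at or above, the selected value."""
    below = int(np.sum(urn < value))
    at_or_above = int(np.sum(urn >= value))
    return below, at_or_above

print(slider_counts(urn, 50))  # roughly (500, 500) given the mean of 50
```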
Participants are given a choice over how they earn their bonus payment. They may take a bonus equal to their productivity value in tokens, or they may apply for one of three jobs, each of which pays a fixed bonus of 90 tokens. Participants who apply unsuccessfully receive a bonus of 0 tokens.
The job application evaluation process is as follows. A single employer wishes to hire 3 workers from the applicant pool. The employer's profits increase in the productivity values of the workers they hire. However, the employer cannot observe applicants' true productivity values at the point of evaluation. Instead, they observe two features of each applicant: their group identity (green or purple) and a noisy but unbiased signal of their true productivity value. The signal is the sum of the productivity value and a stochastic noise term with a commonly known normal distribution of mean 0 and standard deviation 81.85. (Participants are told that for each ball in the first urn there is a second urn of 1000 balls, each carrying an integer centred around the number on the first ball. The second urn is explained in the same terms as the first, and a second sliding tool is provided: given a first-urn ball of value x, participants can slide along to see how many balls in their second urn lie at or below a chosen number.)
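The signal structure and the second sliding tool could be simulated along the following lines; rounding the second-urn balls to integers is again an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
NOISE_SD = 81.85  # commonly known standard deviation of the noise term

def noisy_signal(productivity):
    """Unbiased signal: true productivity plus N(0, 81.85) noise,
    rounded to an integer to match the second-urn framing."""
    return int(round(productivity + rng.normal(0.0, NOISE_SD)))

def second_urn_count(productivity, value, n_balls=1000):
    """Second sliding tool: given a first-urn ball of value `productivity`,
    count how many of the 1000 second-urn balls lie at or below `value`."""
    second_urn = np.rint(rng.normal(productivity, NOISE_SD, size=n_balls)).astype(int)
    return int(np.sum(second_urn <= value))
```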
The employer delegates the evaluation and hiring process to a robot. Participants are told that a random draw before the experiment determines which of two robot types is employed throughout the entire procedure. Robot A systematically adjusts signals according to the colour of the applicant: it adds 10 to the signals of purple applicants, subtracts 90 from the signals of green applicants, and then hires the 3 applicants with the highest adjusted signals. Robot B hires the applicants it predicts to have the highest productivity, based on estimated conditional distributions of productivity among the likely applicants of each colour; these estimates draw on the information and incentives given to applicants, such as the distribution of types, the colours of previous hires, and the distribution of signals in previous rounds. Robot A thus practises naïve taste-based discrimination, while robot B employs fair but sophisticated inference that can produce statistical discrimination whenever application strategies differ across groups. The description of each robot is left purposefully vague so that subsequent treatments can fill in these gaps. In practice, robot B is employed in all sessions.
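The two hiring rules could be sketched as follows. Robot A's adjustment rule is stated exactly in the design; robot B's inference is deliberately underspecified, so the normal-normal posterior-mean ranking below, and the colour-specific priors it takes as input, are only one possible formalisation and not the implementation used in the experiment.

```python
NOISE_VAR = 81.85 ** 2  # known variance of the signal noise

def robot_a(applicants, n_hires=3):
    """Robot A: taste-based colour adjustment of raw signals, then hire the top 3.
    Each applicant is a dict with keys 'colour' and 'signal'."""
    def adjusted(a):
        return a["signal"] + (10 if a["colour"] == "purple" else -90)
    return sorted(applicants, key=adjusted, reverse=True)[:n_hires]

def robot_b(applicants, colour_priors, n_hires=3):
    """One possible formalisation of robot B: rank applicants by posterior mean
    productivity under colour-specific normal priors (which, in the design, would
    be estimated from the induced application incentives and previous rounds).
    colour_priors maps colour -> (prior_mean, prior_var)."""
    def posterior_mean(a):
        m0, v0 = colour_priors[a["colour"]]
        # standard normal-normal updating with known noise variance
        return (v0 * a["signal"] + NOISE_VAR * m0) / (v0 + NOISE_VAR)
    return sorted(applicants, key=posterior_mean, reverse=True)[:n_hires]
```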
In each round, participants are first asked whether they wish to apply for a job. They are then incentivised to guess the maximum productivity value among the applicants of each group identity. (Guesses are scored with a banded rule: participants receive 50 tokens if they are within 10 of the true value, 25 tokens if they are within 20, and nothing otherwise.) After the hiring decisions have been made and announced to participants, they are incentivised to guess which robot type has been employed. To incentivise this guess, I give participants 100 tokens and ask them to distribute these between the two robot types; they receive the tokens they placed on the correct robot type as a bonus payment.
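The two belief-elicitation payoffs amount to the following; treating "within 10" and "within 20" as inclusive bounds is an assumption for illustration.

```python
def max_guess_payoff(guess, true_value):
    """Tokens for the maximum-productivity guess: 50 if within 10 of the truth,
    25 if within 20, 0 otherwise (bounds treated as inclusive)."""
    error = abs(guess - true_value)
    if error <= 10:
        return 50
    if error <= 20:
        return 25
    return 0

def robot_guess_payoff(tokens_on_a, tokens_on_b, true_robot):
    """Participants split 100 tokens between the two robot types and keep
    whatever they placed on the correct one."""
    assert tokens_on_a + tokens_on_b == 100
    return tokens_on_a if true_robot == "A" else tokens_on_b
```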
Participants repeat this process for twelve rounds. Across all rounds, each participant's group identity and the robot type employed are fixed, while each participant draws a new productivity value in every round. After all rounds, participants are paid the bonus token balance from one round chosen at random, at an exchange rate of 100 tokens = 2 British pounds.
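The payment rule reduces to the following calculation; the function name and seed are illustrative.

```python
import random

def final_payment_gbp(round_bonuses, rng=random.Random(2)):
    """Pay the token balance of one randomly chosen round at 100 tokens = 2 GBP.
    `round_bonuses` is the list of twelve per-round token totals."""
    chosen_round_tokens = rng.choice(round_bonuses)
    return chosen_round_tokens * 2 / 100
```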
Treatments
Treatments are assigned at the society level and are summarised in a table (not shown). Each session contains two societies, each with a randomly drawn treatment. The baseline treatment allows me to test for the misattribution of discrimination. I employ two further treatments to test for the causes and the consequences of such misattribution, respectively.
In the first treatment, I investigate why individuals misattribute discrimination to taste-based sources by relieving participants of the strategic uncertainty in applications. Specifically, I use this treatment to test whether cursedness leads to misattribution, hypothesising that this uncertainty leads to misperceptions of the underlying mechanics of the discrimination. In this treatment condition, AVG, I inform participants of the true maximum ability among applicants of each coloured identity after they have predicted it and before they predict which robot is being used. This broadens the information structure given to participants to include not only the realised outcomes of the game but also the strategies played by others. In this treatment, I aim to convey to participants that, if they believe robot B may discriminate when average abilities among applicants differ, then such discrimination may indeed be occurring.
To test for the consequences of misattribution, I employ the treatment INF, in which I inform participants after round 6 that they have been interacting with robot B and, therefore, that the discrimination thus far is statistical in nature. After this round, in all treatment arms, I measure two attributes and compare individual responses with the same measures conducted in the baseline treatment. First, I measure individual willingness to pay (WTP) for one round of affirmative action (AA), under which the employer is forced to hire at least one worker of each group. Under the policy, a coin is flipped to determine a colour: the side the coin lands on dictates which colour must account for 2 of the hires and which colour must account for 1. This decision is incentivised by the following procedure. Participants state a number between 1 and 50; a random participant is then chosen and a number x is drawn between 1 and 50. If x is above the participant's stated number, affirmative action is not enacted in the following round. If x is below the stated number, affirmative action occurs in the next round and x tokens are deducted from that participant's final payment. (This incentivisation method is inspired by the Becker-DeGroot-Marschak mechanism (Becker et al., 1964); however, I remove the strategic uncertainty inherent in the allocation mechanism, thereby reducing the incentive to overstate true valuations through mechanisms such as the winner's curse.)
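The elicitation and the quota draw could be implemented roughly as follows. The design does not specify what happens when x equals the stated number, so treating a tie as "no affirmative action" is an assumption.

```python
import random

def aa_bdm_outcome(stated_number, rng=random.Random(3)):
    """BDM-style draw for the affirmative-action WTP elicitation: x is drawn
    uniformly from 1..50; AA is enacted next round (and x tokens deducted)
    only if x falls below the stated number."""
    x = rng.randint(1, 50)
    if x < stated_number:
        return {"aa_enacted": True, "deduction": x}
    return {"aa_enacted": False, "deduction": 0}

def aa_quota(rng=random.Random(4)):
    """Coin flip deciding which colour must account for 2 of the 3 hires."""
    two_slot_colour = rng.choice(["green", "purple"])
    one_slot_colour = "purple" if two_slot_colour == "green" else "green"
    return {two_slot_colour: 2, one_slot_colour: 1}
```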
The second measure establishes participants' perception of the fairness of the situation they are in. I take this measure directly after the willingness-to-pay elicitation but before the results are finalised. I ask participants to rate, on a 5-point Likert scale, the extent to which they consider the hiring process thus far to have been fair.
Post-experiment survey
One may be concerned that my measure of WTP for AA does not appropriately capture the intended preferences. I therefore follow up this measure with a post-experiment survey. In this survey, I ask participants how effective a one-period quota would have been at increasing the share of green hires and applicants in later rounds. I also employ a risk assessment task as per Gneezy and Potters (1997): individuals are given 100 tokens and asked how many of these they wish to invest in a lottery that pays 2.5 times the investment with probability one third and nothing otherwise.
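The risk task payoff could be computed as follows; that participants keep the uninvested tokens is the standard Gneezy-Potters assumption rather than something stated above.

```python
import random

def risk_task_payoff(invested, rng=random.Random(5)):
    """Gneezy-Potters style task: the invested amount pays 2.5x with
    probability 1/3 and is lost otherwise; uninvested tokens are kept."""
    assert 0 <= invested <= 100
    kept = 100 - invested
    lottery = 2.5 * invested if rng.random() < 1 / 3 else 0.0
    return kept + lottery
```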