Experimental Design Details
Experiment 1:
Experiment 1 consists of two blocks.
Block 1 is designed to estimate individual-level selection neglect and incorrect models. Each round, participants see information about two neighborhoods, each with a number of crimes drawn from a city-specific distribution (neighborhood A1 is a draw from the distribution of city A; neighborhood B1 is an independent draw from the distribution of city B). Importantly, the number of crimes is unobserved by participants, who instead see the following information:
- Number of reported crimes.
- Reporting rate (10-90%): the average proportion of crimes that are reported. Drawn from a uniform distribution on [10, 90], restricted to "easy" numbers (multiples of 5, e.g., 10, 20, 25, 35) to reduce cognitive load.
- Average age (20-40): of the neighborhood residents.
- Unemployment rate (5-20%).
- Socioeconomic status (1-6).
Participants know they can infer the number of crimes from the reporting rate and the reported crimes alone. All the other variables are independent draws from different distributions, so there is no relationship between these variables and crime. After seeing this information, participants choose which neighborhood to patrol and are incentivized to patrol the neighborhood with the highest number of crimes (reported and unreported). After 9 choices, participants are informed that for the last decision they will be able to see only one piece of information in addition to the number of reported crimes, and they are asked to choose which variable they would like to see. They then make a final decision with those two pieces of information.
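The Block 1 data-generating process and the rational inference benchmark can be sketched as follows. This is a minimal illustrative sketch: the city crime distribution, the covariate distributions, and all function names are assumptions, since the text does not pin them down.

```python
import random

def draw_neighborhood(city_mean):
    """Draw one neighborhood; only reports, the rate, and covariates are shown.

    The Gaussian crime distribution is a hypothetical choice; the text only
    says crimes come from a city-specific distribution.
    """
    crimes = max(0, round(random.gauss(city_mean, 10)))  # true crimes (hidden)
    rate = random.choice(range(10, 91, 5))               # "easy" multiples of 5 on [10, 90]
    reported = round(crimes * rate / 100)                # reported crimes (shown)
    covariates = {                                       # all independent of crime
        "avg_age": random.randint(20, 40),
        "unemployment": random.randint(5, 20),
        "ses": random.randint(1, 6),
    }
    return crimes, reported, rate, covariates

def inferred_crimes(reported, rate):
    """Rational benchmark: invert the reporting process."""
    return reported / (rate / 100)
```

A participant who accounts for selection should rank neighborhoods by `inferred_crimes`, not by raw reports: 20 reports at a 50% rate imply more crime (40) than 30 reports at a 90% rate (about 33).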
This design allows us to identify (1) whether participants account for the selection in reported crime data, (2) which variables carry more weight in their patrolling decisions, and (3) which variables they identify as most informative about the number of crimes.
---
Block 2 uses a similar design but focuses on reported crimes and reporting rates. Participants now see neighborhoods drawn from two different cities that have the same distribution of crimes (though participants do not know this). In Block 2, participants see only the number of reported crimes and the reporting rate of each neighborhood, and they make a patrolling decision based on that information. As in Block 1, they are incentivized to patrol the neighborhood with the highest number of crimes, and they make 15 decisions over the course of the block. After those 15 decisions, participants are incentivized to guess which city has the higher average number of crimes and by how much (up to 5% more, between 5-15%, between 15-25%, or more than 25% more).
We randomize participants into two groups for all 15 decisions. For those in the Orthogonal group, the reporting rates of both neighborhoods are drawn from the same uniform distribution on [10, 90], as in Block 1. For those in the Differential group, the reporting rates of one city are drawn from a uniform distribution on [40, 90], while those of the other city are drawn from [10, 60]. This allows us to identify whether differential data selection generates statistical discrimination and overpolicing.
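The two reporting-rate regimes can be sketched as below. Carrying the multiples-of-5 restriction over from Block 1 is an assumption here, and the function and group names are illustrative.

```python
import random

def draw_rate(group, city):
    """Draw a reporting rate (in %) for a neighborhood of `city`.

    Orthogonal: both cities share the [10, 90] support, so rates carry no
    information about which city a neighborhood belongs to.
    Differential: city "A" is systematically over-reported relative to "B"
    (the labels A/B are hypothetical).
    """
    if group == "orthogonal":
        lo, hi = 10, 90
    elif group == "differential":
        lo, hi = (40, 90) if city == "A" else (10, 60)
    else:
        raise ValueError(group)
    return random.choice(range(lo, hi + 1, 5))
```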
Additionally, for the last 7 decisions of the block we randomize participants into two groups. Those in the Exogenous group see reporting rates drawn in the same way described above, keeping the Orthogonal/Differential treatment arm. For those in the Endogenous group, reporting rates depend on their previous decision: we first draw all reporting rates in the same manner as for the Exogenous group, then add a "Reporting Bonus" of 15% to the next neighborhood of the city participants decide to patrol and subtract 15% from the next neighborhood of the city they decide not to patrol. This mimics the dynamics of data selection in which patrolling generates more crime data (more police reports and arrests), which is then used to predict crime in the next turn.
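The Endogenous adjustment can be sketched as follows. Clamping the adjusted rates to the [10, 90] support is an assumption not stated in the text, and the function name is hypothetical.

```python
def endogenous_rates(base_rate_patrolled, base_rate_other, bonus=15):
    """Apply the "Reporting Bonus" to the next round's reporting rates.

    The next neighborhood of the patrolled city gets +bonus to its
    (independently drawn) reporting rate; the next neighborhood of the
    non-patrolled city gets -bonus. Rates are kept inside [10, 90]
    (an assumption about boundary handling).
    """
    next_patrolled = min(90, base_rate_patrolled + bonus)
    next_other = max(10, base_rate_other - bonus)
    return next_patrolled, next_other
```

Because patrolled cities accumulate higher reporting rates, their raw report counts drift upward even when true crime is identical, which is the feedback loop the treatment is designed to expose.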
Finally, for the last 7 decisions of the block we also randomize participants into three groups. Those in the Algorithm group see a recommendation made by an algorithm that takes the current reported crime data, adjusts it by the reporting rate, and averages it with the average of the last 3 neighborhoods drawn from that city. The algorithm recommends patrolling the neighborhood with the highest average crime. Those in the Algorithm Neglect group see a recommendation from an algorithm that does exactly the same but without adjusting by reporting rates. Those in the Control group see no recommendation. Both algorithms use previous data, even though this data is non-informative conditional on the current data, to mimic the functioning of hotspot and predictive policing algorithms. These interventions allow us to measure whether predictive policing algorithms can abate or reinforce cognitive biases.
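The two recommendation rules can be sketched as one function with a flag. The simple average of the adjusted current signal and the mean of the last three neighborhoods is an assumption about how the averaging is implemented; the data-structure choices are also illustrative.

```python
def algorithm_recommendation(reported, rates, history, adjust=True):
    """Recommend which city's neighborhood to patrol.

    reported: {city: reported crimes in the current neighborhood}
    rates:    {city: reporting rate in %}
    history:  {city: crime estimates for that city's previous neighborhoods}
    adjust=True  -> the reporting-rate-adjusted "Algorithm" rule
    adjust=False -> the "Algorithm Neglect" rule (raw reports)
    """
    scores = {}
    for city in reported:
        # Current-round signal, with or without the selection adjustment.
        current = reported[city] / (rates[city] / 100) if adjust else reported[city]
        # Average of (up to) the last 3 neighborhoods from this city.
        recent = history[city][-3:]
        scores[city] = (current + sum(recent) / len(recent)) / 2
    return max(scores, key=scores.get)
```

The two rules can disagree: with 20 reports at a 40% rate versus 30 reports at a 90% rate (and identical histories), the adjusted rule favors the first city while the neglecting rule favors the second.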
At the end of Block 2, participants make an additional patrolling decision between neighborhoods drawn from the same cities as in the rest of the block. Participants now choose while seeing only the number of reports. After this choice, they have the opportunity to buy the reporting rate of each neighborhood at a randomly selected price. If they buy, the price is deducted from their performance bonus for that decision only if they make the correct choice. In other words, they can never lose money by buying. If they decide to buy, they repeat the decision with the new information; if they do not, Block 2 ends.
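The payoff logic of this information-purchase stage can be written out as below, under the assumption that the performance bonus is paid only for a correct choice; the payoff units and names are hypothetical.

```python
def payoff(bought, correct, bonus, price):
    """Payoff in the end-of-block willingness-to-pay elicitation.

    An incorrect choice earns nothing either way; the price is deducted
    only when the participant bought AND chose correctly, so buying can
    never make the payoff negative.
    """
    if not correct:
        return 0
    return bonus - price if bought else bonus
```

Since buying weakly raises the chance of a correct (bonus-earning) choice at no downside risk, declining to buy is informative about how much participants think the reporting rate matters.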
---
Experiment 2
In the first block of the experiment, participants are shown the distribution of crime levels across neighborhoods within a given city. Then, over the course of 20 rounds, they observe both the number of reported crimes in a specific neighborhood and the reporting rate for that neighborhood. The number of reports is constructed as a noisy signal of both the true crime level and the reporting rate. Using this information, participants are asked to estimate the underlying crime level of the neighborhood in each round.
We model this belief-updating process within a Bayesian framework, which allows us to estimate two key parameters: (1) the degree of selection neglect, i.e., the extent to which participants fail to properly account for the reporting rate when inferring crime levels; and (2) the relative weight participants place on their prior beliefs versus the new information when forming posteriors.
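One stylized way to write such a belief-updating rule with both parameters is sketched below. The functional form and parameter names (`chi` for selection neglect, `alpha` for the prior weight) are illustrative choices, not necessarily the paper's estimation model.

```python
def posterior_mean(prior_mean, reported, rate, chi=1.0, alpha=0.5):
    """Stylized belief update with selection neglect.

    Perceived crime signal: reported / rate**chi, with chi in [0, 1]
    measuring how fully the reporting rate is taken into account
    (chi = 1: full adjustment; chi = 0: complete selection neglect).
    alpha is the weight on the prior; rate is a fraction in (0, 1].
    """
    signal = reported / (rate ** chi)
    return alpha * prior_mean + (1 - alpha) * signal
```

With a prior mean of 50 crimes and 20 reports at a 50% rate, a fully adjusting participant (chi = 1) perceives a signal of 40, while a fully neglecting one (chi = 0) perceives 20; the gap between their stated estimates across rounds is what identifies the neglect parameter.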
In the second block, participants are presented with information from neighborhoods in two different cities. For each neighborhood, they receive only two pieces of information: the number of reported crimes and the reporting rate. Based on this, they make an incentivized guess as to which neighborhood has the higher true crime level.
We introduce two interventions in Block 2 to examine whether prediction accuracy improves or deteriorates. In the real-world context of the Colombian police, an intelligence agency analyzes crime reports to infer differences in crime levels across neighborhoods and uses this to guide patrol allocation. We leverage this institutional feature to conduct an ecologically valid intervention. Specifically, we inform participants of the intelligence agency's recommendation regarding which neighborhood has the highest level of criminality.
In one treatment, the agency provides an incorrect recommendation (i.e., identifying the lower-crime neighborhood as the higher one), while in the other, it provides the correct recommendation. We then compare prediction accuracy across the control group and each treatment group to assess the influence of external recommendations on officers' decision-making.