
Fields Changed

Registration

Field: Abstract

Before:

Artificial intelligence (AI) is ubiquitous and affects human behavior in many ways. Some studies have examined the benefits of AI in increasing the accuracy of human decision making, as well as people’s attitudes toward AI. However, experimental evidence on the effect of AI on individual preferences is limited. This project investigates people’s risk-taking behavior under AI-based recommendations. In our experiment, subjects make binary choices between two lotteries that differ in consequences and probabilities. We construct a set of choice problems to examine risk-taking behavior systematically: whether decision axioms are satisfied, whether typical paradoxes arise, how well decision-theoretic models fit, and so on. The experiment consists of three stages. In Stage 1, subjects make decisions without recommendation. In Stage 2, subjects are randomly divided into five groups. Two baseline groups receive either no recommendation or a random recommendation. Three treatment groups receive recommendations based on user-based collaborative filtering, item-based collaborative filtering, or the popularity of options, which are representative methods used in the field. Subjects make their own decisions after receiving the recommendation, if any. After Stage 2, subjects in the groups with recommendations are asked to pay to receive or avoid the recommendation for the next stage, in which they face another 60 choice problems. This experiment allows us to compare risk-taking behavior and attitudes toward AI across groups.

After:

Artificial intelligence (AI) is ubiquitous and affects human behavior in many ways. Some studies have examined the benefits of AI in increasing the accuracy of human decision making, as well as people’s attitudes toward AI. However, experimental evidence on the effect of AI on individual preferences is limited. This project investigates people’s risk-taking behavior under AI-based recommendations. In our experiment, subjects make binary choices between two lotteries that differ in consequences and probabilities. We construct a set of choice problems to examine risk-taking behavior systematically: whether decision axioms are satisfied, whether typical paradoxes arise, how well decision-theoretic models fit, and so on. The experiment consists of three stages. In Stage 1, subjects make decisions without recommendation. In Stage 2, subjects are randomly divided into five groups. Two baseline groups receive either no recommendation or a random recommendation. Three treatment groups receive recommendations based on user-based collaborative filtering, random forest, or the popularity of options, which are representative methods used in the field. Subjects make their own decisions after receiving the recommendation, if any. After Stage 2, subjects in the groups with recommendations are asked to pay to receive or avoid the recommendation for the next stage, in which they face another 60 choice problems. This experiment allows us to compare risk-taking behavior and attitudes toward AI across groups.

Field: Last Published

Before: March 21, 2022 01:20 PM
After: April 03, 2022 09:59 PM
Field: Experimental Design (Public)

Before:

This project investigates whether and how AI-based recommendations affect risk-taking behavior.

Decision Task

Risk-taking behavior has been widely explored through choice problems, in which subjects make binary choices between two prospects that differ in consequences and probabilities. Choice problems provide an experimental framework in which we can systematically observe individual risk-taking behavior and construct measures for comparison. Specifically, subjects in our experiment choose between Option A and Option B. Options can be either certain or uncertain payments, with at most three possible consequences. Option A is described by a list of parameters, {xA1, pA1, xA2, pA2, xA3, pA3}; that is, Option A yields xA1 with probability pA1, xA2 with probability pA2, and xA3 with probability pA3. Option B is described by another such list.

We design our parameters in two manners that serve different analytical purposes. The first employs the probability triangle popularized by Machina (1982), a framework for representing the choice space over lotteries. We first fix the consequences x1, x2, and x3 in ascending order. Any combination of the corresponding probabilities p1, p2, and p3 can then be specified as a point in the probability triangle, where p1 = 0 on the vertical edge, p2 = 0 on the hypotenuse, and p3 = 0 on the horizontal edge. In this way, we can construct and represent different lotteries (different p1, p2, and p3). The triangle has the advantage of explicitly presenting the relationships among lotteries, which makes it efficient for testing different axioms of decision making under uncertainty and different paradoxes. In total, we construct 11 lotteries, from which we design 15 binary choice problems. Through these problems, we can test whether subjects’ preferences respect first-order stochastic dominance (FOSD), second-order stochastic dominance (SOSD), the betweenness axiom, the transitivity axiom, and consistency, and whether subjects exhibit the common ratio effect and the common consequence effect. We can also easily replicate this structure (the list of p1, p2, and p3) by changing the consequences x1, x2, and x3.

The second manner is a randomization algorithm: we randomly generate lotteries in a given space and construct choice problems accordingly. Unlike the first manner, in which observations can be described and classified by axioms and paradoxes, here individual decisions are used to estimate structural models based on fundamental decision theories. From this perspective, randomly generated lotteries help reduce the risk of overfitting (Erev et al., 2017). The estimation results will be used to discuss model fit, out-of-sample predictive power, and characteristics of the estimated models (such as the degree of risk aversion).
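For concreteness, here is a minimal Python sketch of how such lotteries can be encoded, randomly generated, and screened for first-order stochastic dominance. The uniform (Dirichlet) sampling over the probability simplex and the placeholder outcome values are our own illustrative assumptions; the registration does not specify the generating distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_lottery(x1, x2, x3):
    """Draw a lottery over fixed consequences x1 < x2 < x3 by sampling
    (p1, p2, p3) from the probability simplex, i.e., a point in the
    Machina probability triangle; Dirichlet(1, 1, 1) is uniform there."""
    p1, p2, p3 = rng.dirichlet([1.0, 1.0, 1.0])
    return {"x": (x1, x2, x3), "p": (p1, p2, p3)}

def random_choice_problem(x1=0, x2=10, x3=20):
    """A choice problem is a pair of lotteries over the same consequences.
    The outcome values here are placeholders, not the study's parameters."""
    return random_lottery(x1, x2, x3), random_lottery(x1, x2, x3)

def fosd(a, b):
    """True if lottery a first-order stochastically dominates lottery b:
    with common ordered outcomes, a's CDF lies weakly below b's at x1 and
    x2 (the CDFs always agree at x3), and strictly below at one of them."""
    ca, cb = np.cumsum(a["p"][:2]), np.cumsum(b["p"][:2])
    return bool(np.all(ca <= cb) and np.any(ca < cb))

option_a, option_b = random_choice_problem()
print(fosd(option_a, option_b))
```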
Pre-Experiment

Before the main experiment, we conduct a pre-experiment consisting of three stages. Each stage has 60 choice problems: 30 constructed through two probability triangles and 30 constructed through the randomization algorithm. Subjects make decisions without recommendation. The pre-experiment collects the data that the AI-based recommendations need in order to function; we explain how this data set is used when we introduce the recommendation methods below.

Experiment

Our experiment follows the same structure as the pre-experiment and uses the same 180 choice problems. In Stage 1, all subjects complete 60 choice problems without recommendation. In Stage 2, subjects are randomly assigned to five groups: (1) Control – subjects receive no recommendation; (2) Random – subjects receive a randomly generated recommendation; (3) UBCF – subjects receive a recommendation from user-based collaborative filtering (UBCF); (4) IBCF – subjects receive a recommendation from item-based collaborative filtering (IBCF); (5) Majority – subjects receive the modal decision for each problem observed in the pre-experiment. Subjects make their own decisions after receiving the recommendation, if any.

We are interested in comparing behavior across groups. Group Control is a baseline condition that captures behavior without recommendation. Group Random is another baseline condition, in which a recommendation exists but has no instrumental value. Groups UBCF and IBCF are the two main treatment conditions that provide AI-based recommendations. Collaborative filtering is one of the most common recommendation techniques; several popular websites, including Amazon, YouTube, and Netflix, use it as part of their recommendation systems. Here we incorporate two types of collaborative filtering (both sketched in the code below). Recall that a set of subjects completed all 180 choice problems without recommendation in the pre-experiment; we call them the dataset subjects. Intuitively, for a targeted subject in the experiment, UBCF compares her Stage 1 decisions with the dataset subjects’ Stage 1 decisions and identifies the dataset subjects most similar to her. Then, for each problem in Stage 2, UBCF recommends a weighted average of the identified similar subjects’ decisions. Instead of searching for similar subjects, IBCF searches for similar problems. For a targeted problem in Stage 2, IBCF compares the dataset subjects’ decisions on this problem with their decisions on all Stage 1 problems and identifies the most similar Stage 1 problems. Then, for each subject in Stage 2, IBCF recommends a weighted average of that subject’s own decisions on the identified similar problems. Group Majority captures another common recommendation method in the field, based on the popularity of options. After Stage 2, for the four groups with recommendations, we elicit subjects’ valuation of the recommendation: subjects know that there will be another 60 choice problems in Stage 3, and they can pay to receive or to avoid the recommendation. Stage 3 serves to realize these valuation decisions. The experiment ends with a survey.
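The registration describes UBCF and IBCF only informally. The sketch below shows one standard implementation of both on binary decision matrices (0 = chose Option A, 1 = chose Option B), using cosine similarity with top-k neighborhoods; the similarity measure, the neighborhood size k, and the 0.5 rounding threshold are our own illustrative assumptions.

```python
import numpy as np

def ubcf_recommend(dataset_s1, dataset_s2, target_s1, k=10):
    """User-based CF: find the k dataset subjects whose Stage 1 decisions
    are most similar (cosine) to the target's, then recommend the
    similarity-weighted average of their Stage 2 decisions per problem."""
    U = dataset_s1 / np.maximum(np.linalg.norm(dataset_s1, axis=1, keepdims=True), 1e-12)
    t = target_s1 / max(np.linalg.norm(target_s1), 1e-12)
    sims = U @ t                                  # one similarity per dataset subject
    top = np.argsort(sims)[-k:]
    w = sims[top] / max(sims[top].sum(), 1e-12)
    return (w @ dataset_s2[top] > 0.5).astype(int)  # recommended option per Stage 2 problem

def ibcf_recommend(dataset_s1, dataset_s2, target_s1, k=10):
    """Item-based CF: for each Stage 2 problem, find the k Stage 1 problems
    with the most similar decision pattern across the dataset subjects, then
    recommend the weighted average of the target's own decisions on them."""
    A = dataset_s1 / np.maximum(np.linalg.norm(dataset_s1, axis=0), 1e-12)
    B = dataset_s2 / np.maximum(np.linalg.norm(dataset_s2, axis=0), 1e-12)
    sims = B.T @ A                                # (Stage 2 problems) x (Stage 1 problems)
    recs = []
    for s in sims:
        top = np.argsort(s)[-k:]
        w = s[top] / max(s[top].sum(), 1e-12)
        recs.append(int(w @ target_s1[top] > 0.5))
    return np.array(recs)

# Illustrative shapes: 50 dataset subjects, 60 problems per stage.
rng = np.random.default_rng(1)
D1 = rng.integers(0, 2, (50, 60)).astype(float)   # dataset subjects, Stage 1
D2 = rng.integers(0, 2, (50, 60)).astype(float)   # dataset subjects, Stage 2
t1 = rng.integers(0, 2, 60).astype(float)         # target subject, Stage 1
print(ubcf_recommend(D1, D2, t1)[:5], ibcf_recommend(D1, D2, t1)[:5])
```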
After:

This project investigates whether and how AI-based recommendations affect risk-taking behavior.

Decision Task

Risk-taking behavior has been widely explored through choice problems, in which subjects make binary choices between two prospects that differ in consequences and probabilities. Choice problems provide an experimental framework in which we can systematically observe individual risk-taking behavior and construct measures for comparison. Specifically, subjects in our experiment choose between Option A and Option B. Options can be either certain or uncertain payments, with at most three possible consequences. Option A is described by a list of parameters, {xA1, pA1, xA2, pA2, xA3, pA3}; that is, Option A yields xA1 with probability pA1, xA2 with probability pA2, and xA3 with probability pA3. Option B is described by another such list.

We design our parameters in two manners that serve different analytical purposes. The first employs the probability triangle popularized by Machina (1982), a framework for representing the choice space over lotteries. We first fix the consequences x1, x2, and x3 in ascending order. Any combination of the corresponding probabilities p1, p2, and p3 can then be specified as a point in the probability triangle, where p1 = 0 on the vertical edge, p2 = 0 on the hypotenuse, and p3 = 0 on the horizontal edge. In this way, we can construct and represent different lotteries (different p1, p2, and p3). The triangle has the advantage of explicitly presenting the relationships among lotteries, which makes it efficient for testing different axioms of decision making under uncertainty and different paradoxes. In total, we construct 11 lotteries, from which we design 15 binary choice problems. Through these problems, we can test whether subjects’ preferences respect first-order stochastic dominance (FOSD), second-order stochastic dominance (SOSD), the betweenness axiom, the transitivity axiom, and consistency, and whether subjects exhibit the common ratio effect and the common consequence effect. We can also easily replicate this structure (the list of p1, p2, and p3) by changing the consequences x1, x2, and x3.

The second manner is a randomization algorithm: we randomly generate lotteries in a given space and construct choice problems accordingly. Unlike the first manner, in which observations can be described and classified by axioms and paradoxes, here individual decisions are used to estimate structural models based on fundamental decision theories. From this perspective, randomly generated lotteries help reduce the risk of overfitting (Erev et al., 2017). The estimation results will be used to discuss model fit, out-of-sample predictive power, and characteristics of the estimated models (such as the degree of risk aversion).

Pre-Experiment

Before the main experiment, we conduct a pre-experiment consisting of three stages. Each stage has 60 choice problems: 30 constructed through two probability triangles and 30 constructed through the randomization algorithm. Subjects make decisions without recommendation. The pre-experiment collects the data that the AI-based recommendations need in order to function; we explain how this data set is used when we introduce the recommendation methods below.

Experiment

Our experiment follows the same structure as the pre-experiment and uses the same 180 choice problems. In Stage 1, all subjects complete 60 choice problems without recommendation. In Stage 2, subjects are randomly assigned to five groups: (1) Control – subjects receive no recommendation; (2) Random – subjects receive a randomly generated recommendation; (3) UBCF – subjects receive a recommendation from user-based collaborative filtering (UBCF); (4) RF – subjects receive a recommendation from a random forest (RF); (5) Majority – subjects receive the modal decision for each problem observed in the pre-experiment. Subjects make their own decisions after receiving the recommendation, if any.
We are interested in comparing behavior across groups. Group Control is a baseline condition that captures behavior without recommendation. Group Random is another baseline condition, in which a recommendation exists but has no instrumental value. Group UBCF is one main treatment condition that provides AI-based recommendations. Collaborative filtering is one of the most common recommendation techniques; several popular websites, including Amazon, YouTube, and Netflix, use it as part of their recommendation systems. Here we incorporate one type of collaborative filtering. Recall that a set of subjects completed all 180 choice problems without recommendation in the pre-experiment; we call them the dataset subjects. Intuitively, for a targeted subject in the experiment, UBCF compares her Stage 1 decisions with the dataset subjects’ Stage 1 decisions and identifies the dataset subjects most similar to her. Then, for each problem in Stage 2, UBCF recommends a weighted average of the identified similar subjects’ decisions. Group RF is the other main treatment condition that provides AI-based recommendations. Based on each subject’s own decisions in Stage 1, this method identifies the features that are important for her decisions and recommends accordingly (sketched in the code below). Group Majority captures another common recommendation method in the field, based on the popularity of options. After Stage 2, for the four groups with recommendations, we elicit subjects’ valuation of the recommendation: subjects know that there will be another 60 choice problems in Stage 3, and they can pay to receive or to avoid the recommendation. Stage 3 serves to realize these valuation decisions. The experiment ends with a survey.
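The registration does not specify the random forest’s features or hyperparameters. The sketch below assumes each choice problem is encoded by the twelve parameters of its two options, {xA1, pA1, …, xB3, pB3}, fits one scikit-learn random forest per subject on her Stage 1 choices, and predicts her Stage 2 problems; the Majority rule from the pre-experiment data is included for comparison.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_recommend(stage1_X, stage1_y, stage2_X):
    """Fit a random forest on one subject's Stage 1 decisions
    (0 = Option A, 1 = Option B) and predict her Stage 2 problems.
    Feature encoding and hyperparameters are illustrative assumptions;
    model.feature_importances_ indicates which lottery parameters
    matter most for this subject's decisions."""
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(stage1_X, stage1_y)
    return model.predict(stage2_X)

def majority_recommend(dataset_choices):
    """Majority rule: recommend each problem's modal decision among the
    pre-experiment (dataset) subjects."""
    return (dataset_choices.mean(axis=0) > 0.5).astype(int)

# Illustrative shapes: 60 problems per stage, 12 parameters per problem.
rng = np.random.default_rng(2)
X1, X2 = rng.random((60, 12)), rng.random((60, 12))
y1 = rng.integers(0, 2, 60)
print(rf_recommend(X1, y1, X2)[:5])
print(majority_recommend(rng.integers(0, 2, (50, 60)))[:5])
```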