\section{Experimental Design Details}
In the first stage of the experiment, we use a questionnaire to elicit 15 personal characteristics from participants; 12 of these characteristics serve as the basis for the predictive assessments. At this point, we do not inform participants about the purpose of the questionnaire. This allays the concern that participants give intentionally inaccurate, self-serving answers in an attempt to outsmart or game the system. The questionnaire further contains items that directly measure participants' positive reciprocity, for which we adopt established measures from the literature. In our analyses, these measures serve as the ground truth for examining important treatment heterogeneities related to the accuracy of the predictive assessments. Specifically, we consider the actual empirical correlation in our sample between baseline participants' answers to these items and their repayment behavior in stage 2. We list the 12 traits in use in the appendix.
\subsection{Stage 2}
Stage 2 comprises a one-shot investment game with the following basic structure. There are two parties: an investor and a recipient. The investor is endowed with 10 monetary units (MU) and begins by deciding whether to keep the entire 10 MU or invest them with the recipient. If she keeps the 10 MU, the game ends, leaving her and the recipient with payoffs of 10 MU and 0 MU, respectively. If she decides to invest, the recipient receives triple the amount, i.e., 30 MU. The recipient is free to keep the whole amount without repercussion. Crucially, however, the recipient has the option to repay the investor $x \in [0,30]$ MU, thereby reciprocating the investor's initial trust. The investor's and recipient's payoffs then equal $x$ MU and $30-x$ MU, respectively. With this structure, the investment game closely mirrors sequential human transactions that require both trust by the first-moving party (e.g., loan officer, HR manager, supplier) and reciprocity by the second-moving party (e.g., borrower, worker, buyer), especially in situations with incomplete contracts. Notably, participants in our main study always play the recipient role and must indicate how many MU they will return to an investor who initially invests 10 MU with them. After making the repayment decision, we ask participants to indicate what they believe the investor expects to be repaid. If their guess deviates from the investor's actual belief by no more than 5 MU, they earn 5 MU.
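The resulting payoffs can be summarized compactly (with $\pi_I$ and $\pi_R$ denoting the investor's and recipient's payoffs; this notation is introduced here for exposition only):
\[
(\pi_I,\, \pi_R) =
\begin{cases}
(10,\ 0) & \text{if the investor keeps the 10 MU,} \\
(x,\ 30 - x) & \text{if she invests and the recipient repays } x \in [0,30].
\end{cases}
\]
Investing thus leaves the investor strictly better off than keeping the endowment whenever $x > 10$.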
We introduce our between-subject treatment variation before participants make their decisions. In our baseline condition (NoDisc), participants do not receive any additional information. Participants in our two main treatment conditions, however, observe a genuine ML model's predictive assessment of whether or not they are a selfish person who is not expected to repay trust. To provide a reference point and to control beliefs, we explain to participants that a reciprocal person in this game typically repays more than 10 MU, so that a trusting investor is strictly better off than if she had not invested. Before we reveal their assessment, we ask participants to guess the model's prediction accuracy across experimental participants. If their guess deviates from the actual prediction accuracy by no more than 15 percentage points, they earn 5 MU.
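The incentive for this accuracy guess can be written as follows (with $\hat{a}$ the participant's stated guess and $a$ the model's realized prediction accuracy, both in percentage points; this notation is introduced here for exposition only):
\[
\text{bonus} =
\begin{cases}
5 \text{ MU} & \text{if } |\hat{a} - a| \le 15, \\
0 \text{ MU} & \text{otherwise.}
\end{cases}
\]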
Participants in the public disclosure treatment (PubDisc) learn the ML model's predictive assessment and know that the investor also observes this prediction before making his decision, though he is not bound to adhere to it. In our private disclosure treatment (PrivDisc), participants learn the ML model's predictive assessment of themselves but know that we do not reveal it to the investor. We employ this second treatment to disentangle first- and second-order belief effects. The key feature of our design is that we ask treatment participants to make their repayment decisions and state their beliefs at both possible prediction information sets rather than only the one actually reached. Put differently, participants indicate their decisions twice: (i) assuming they are predicted to be a reciprocal person (who typically repays more than 10 MU) and (ii) assuming they are predicted not to be a reciprocal person (who typically repays at most 10 MU). Participants learn their actual assessment only afterwards.
We employ this strategy-method elicitation for three reasons. First, it allows us to observe counterfactuals and to measure participants' beliefs and behaviors conditional on the assessment. This way, we can examine individual-level heterogeneities conditional on the accuracy of the prediction. Second, we can use predictions generated by a pre-trained ML model instead of a mock-up and examine aggregate-level equilibrium outcomes conditional on different predictive performance levels. Third, it provides more data, because we observe two outcomes per participant.
To investigate the role of knowing that a machine performs the predictive assessment, we run two control treatments: PrivDisc-H and PubDisc-H. These control treatments perfectly mirror the PrivDisc and PubDisc conditions, respectively, except that recipients learn that the prediction is made not by a machine learning model but by a human expert. In this way, we can isolate any idiosyncratic effects driven by the computerized nature of algorithmic assessments.
Once participants finish stage 2, the experiment ends with a questionnaire containing several items on trust in the predictive assessment and a manipulation check (in all but the baseline treatment). These variables serve as additional controls in our regression analyses. On the final screen, we inform participants about the game outcomes in each stage, the actual assessment of themselves, and their earnings.