Optimising Signals in Human-AI Interaction - Experiment 1

Last registered on October 21, 2024

Pre-Trial

Trial Information

General Information

Title

Optimising Signals in Human-AI Interaction - Experiment 1

RCT ID

AEARCTR-0013716

Initial registration date

May 30, 2024

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published

May 30, 2024, 5:51 AM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Last updated

October 21, 2024, 2:32 PM EDT

Last updated is the most recent time when changes to the trial's registration were published.

Locations

Country

United States of America

Region

Primary Investigator

Name

Ruru (Juan Ru) Hoong

Affiliation

Harvard University

Other Primary Investigator(s)

PI Name

Bnaya Dreyfuss

PI Affiliation

Additional Trial Information

Status

In development

Start date

2024-05-30

End date

2024-12-31

Keywords

Finance & Microfinance, Firms & Productivity, Labor

Additional Keywords

AI, decision-making

JEL code(s)

Secondary IDs

Prior work

This trial does not extend or rely on any prior RCTs.

Abstract

Artificial intelligence (AI) has caught up with, and in some contexts even surpassed, humans in the ability to make predictions from data, with the promise of improving decision-making. However, in cases where humans are still responsible for the final decision, biases in probabilistic reasoning can render even informative AI predictions detrimental to decision-making outcomes. Using a randomised experiment with loan underwriters, we show that the provision of optimised AI signals can improve overall decision-making in spite of information loss.

External Link(s)

Registration Citation

Citation

Hoong, Ruru (Juan Ru) and Bnaya Dreyfuss. 2024. "Optimising Signals in Human-AI Interaction - Experiment 1." AEA RCT Registry. October 21. https://doi.org/10.1257/rct.13716-6.0

Sponsors & Partners

Experimental Details

Interventions

Intervention(s)

Participants see different types of AI signals to aid their decision-making.

Intervention (Hidden)

In a cross-randomised design, we vary (i) the type of AI signal that is shown {binary, probability, none}, with sub-treatments varying the thresholds at which the binary signals are shown, and (ii) potentially, whether or not participants are put through "loan underwriter training". We omit this latter treatment for participants who are already experienced loan underwriters.

We also vary at the question level (i) whether the AI signal or the real loan application is shown first, and (ii) whether the human posterior is elicited after seeing just the loan application.

See the pre-analysis plan for more info.
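
The assignment described in this field could be sketched roughly as follows. This is an illustration only, not the study's oTree code: the threshold values, the 50/50 assignment probabilities, the direction of the binary signal, and all function and key names are assumptions made for the example. The actual values are in the pre-analysis plan.

```python
import random

SIGNAL_TYPES = ["binary", "probability", "none"]  # signal types named in the registration
BINARY_THRESHOLDS = [0.3, 0.5, 0.7]  # hypothetical values, not taken from the study


def assign_participant(seed, experienced_underwriter):
    """Draw participant-level treatments for one subject (illustrative only)."""
    rng = random.Random(seed)
    stage_order = SIGNAL_TYPES[:]
    rng.shuffle(stage_order)  # every subject sees every signal type, in random order
    return {
        "stage_order": stage_order,
        "binary_threshold": rng.choice(BINARY_THRESHOLDS),
        # Training is never assigned to experienced underwriters.
        # The 50/50 split for everyone else is an assumption.
        "training": False if experienced_underwriter else rng.random() < 0.5,
    }


def assign_question(rng):
    """Draw the two question-level treatments (50/50 split is an assumption)."""
    return {
        "signal_first": rng.random() < 0.5,  # AI signal shown before the application
        "elicit_after_application": rng.random() < 0.5,  # posterior elicited after the application alone
    }


def binary_signal(prob_repay, threshold):
    """Collapse a predicted repayment probability into a binary signal.

    Assumes the signal reads "approve" when the probability is at or above the threshold.
    """
    return "approve" if prob_repay >= threshold else "deny"
```

Seeding the generator per participant keeps the draw reproducible, which is one way to make the assignment auditable after the fact.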

Intervention Start Date

2024-10-02

Intervention End Date

2024-10-31

Primary Outcomes

Primary Outcomes (end points)

The main dependent variable we will look at is decision accuracy (i.e. whether the decision to approve/deny the loan was ex-post correct).

Our main specification will use data from all rounds in the experiment. Although it will be less well powered, we will also run a robustness check using only first-round decisions.
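
As a minimal sketch of the outcome variable, decision accuracy can be computed as the share of approve/deny decisions that match the realised repayment outcome. The record layout (keys `approved`, `repaid`, `round`) and the sample rows are made up for illustration, and treating "first round" as `round == 1` is an assumption.

```python
def decision_accuracy(decisions):
    """Share of decisions that were ex-post correct.

    A decision is correct when the loan was approved and repaid,
    or denied and not repaid.
    """
    if not decisions:
        raise ValueError("no decisions to score")
    correct = sum(d["approved"] == d["repaid"] for d in decisions)
    return correct / len(decisions)


def first_round_only(decisions):
    """Subset used for the robustness check (assumes rounds are numbered from 1)."""
    return [d for d in decisions if d["round"] == 1]


# Made-up records, for illustration only.
sample = [
    {"approved": True, "repaid": True, "round": 1},
    {"approved": True, "repaid": False, "round": 1},
    {"approved": False, "repaid": False, "round": 2},
    {"approved": False, "repaid": True, "round": 2},
    {"approved": True, "repaid": True, "round": 2},
]

print(decision_accuracy(sample))                    # all rounds
print(decision_accuracy(first_round_only(sample)))  # first round only
```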

Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)

Secondary Outcomes (explanation)

Experimental Design

We recruit 200 loan underwriters/officers/processors, as well as other participants, to make loan application decisions.

Experimental Design Details

Prolific participants with a statistics/finance/accounting background are recruited. Loan underwriters/officers are recruited on online work platforms to participate in our study. We present binary prediction policy problems: approve the loan for applicants expected to make timely payments, and deny the loan otherwise.
There is a practice round in which the loan application is shown with no AI signal.
In the main rounds, incentives are structured to balance the cost of false positives and false negatives: a bonus of $0.10 is awarded for each loan correctly denied and for each loan correctly approved. Participants see both a subset of a real loan application and a signal from an AI model trained on a loan repayment dataset with similar loans from Home Credit. We vary the intervention as described above. All participants will go through all order-randomised stages. In each stage, they will make ten approve/deny loan decisions and provide the percent chance (posterior) that each loan application will be repaid.
Lastly, we have a demand elicitation in which we elicit demand for binary, probability, or no AI signals. A random row of the multiple price list is implemented for two more loan decisions. We also elicit participants' costs of false positives and false negatives (FP/FN) through a hypothetical question.
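
The symmetric bonus rule for the main rounds can be illustrated with a short sketch. Only the $0.10 amount comes from the registration. The function name and the `(approved, repaid)` tuple layout are assumptions, and amounts are held in integer cents to avoid floating-point rounding.

```python
BONUS_CENTS = 10  # $0.10 per correct decision, as stated in the registration


def bonus_cents(decisions):
    """Total bonus in cents for an iterable of (approved, repaid) pairs.

    A correct approval and a correct denial earn the same bonus,
    so false positives and false negatives cost the participant equally.
    """
    return sum(BONUS_CENTS for approved, repaid in decisions if approved == repaid)


# One stage of ten decisions with seven correct would pay 70 cents.
stage = [(True, True)] * 4 + [(False, False)] * 3 + [(True, False)] * 2 + [(False, True)]
print(bonus_cents(stage) / 100)
```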

For participants who are not loan officers/underwriters, we plan to randomise half into the loan training treatment. We will assess whether the training succeeds in shifting their (perceived) private information, which we will measure using their decision boundary graphs.

See the pre-analysis plan for more info.

Randomization Method

Randomisation is done through the oTree programme.

Randomization Unit

Individual

Was the treatment clustered?

Experiment Characteristics

Sample size: planned number of clusters

200 loan officers/processors/underwriters.

Sample size: planned number of observations

We will aim to recruit 200 loan officers/processors/underwriters, unless otherwise limited by our recruitment (i.e., 2 weeks go by without any new loan officers/processors/underwriters). We will filter out subjects in our analysis based on attention and time checks.

Sample size (or number of clusters) by treatment arms

200 in each arm. Each subject will see every treatment arm in randomised order.

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

See PAP.

Supporting Documents and Materials

IRB