Belief Updating under AI Advice: An Experiment

Last registered on January 17, 2025

Pre-Trial

Trial Information

General Information

Title
Belief Updating under AI Advice: An Experiment
RCT ID
AEARCTR-0015153
Initial registration date
January 12, 2025

First published
January 17, 2025, 6:49 AM EST

Locations

Information on trial locations is not publicly available; access may be requested through the Registry.

Primary Investigator

Affiliation
Wuhan University

Other Primary Investigator(s)

PI Affiliation
Peking University HSBC Business School
PI Affiliation
Wuhan University
PI Affiliation
Wuhan University

Additional Trial Information

Status
Ongoing
Start date
2024-12-01
End date
2025-05-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
The rapid integration of AI into healthcare presents a critical challenge: understanding how medical professionals incorporate AI advice into their diagnostic decision-making. While AI systems demonstrate increasing accuracy in medical diagnosis, their effectiveness ultimately depends on how humans update their beliefs in response to AI recommendations. We conduct a lab-in-the-field experiment examining belief updating processes when participants receive AI-generated medical advice. Our experimental design employs a quadratic scoring rule to elicit truthful reporting of beliefs across multiple treatment scenarios. Participants first provide prior probability estimates for treatment success, then report their expectations about the informativeness of upcoming AI advice, and finally update their beliefs after receiving AI recommendations. This three-stage elicitation, grounded in Bayesian updating frameworks, allows us to examine both the direct impact of AI advice on belief formation and the role of ex-ante trust in AI systems.
External Link(s)

Registration Citation

Citation
Khanna, Manshu et al. 2025. "Belief Updating under AI Advice: An Experiment." AEA RCT Registry. January 17. https://doi.org/10.1257/rct.15153-1.0
Sponsors & Partners

Information on sponsors and partners is not publicly available; access may be requested through the Registry.

Experimental Details

Interventions

Intervention(s)
This study examines how individuals update their beliefs when receiving AI-generated advice in medical decision-making scenarios. Participants evaluate treatment options across multiple hypothetical medical cases. For each case, participants: 1) provide their initial assessment of treatment success probabilities, 2) indicate their expectations about upcoming AI advice, and 3) make final assessments after receiving AI recommendations. The AI system used is ChatGPT-4o with a documented accuracy rate above 70%. All probability assessments are incentivized using proper scoring rules to encourage truthful reporting.
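As a concrete illustration of how a quadratic scoring rule makes truthful probability reporting incentive-compatible, the sketch below computes a payoff from a reported probability vector over five treatment options. The payoff constants and function names are hypothetical; the registration does not specify the actual payment parameters.

```python
# Illustrative quadratic scoring rule payoff (hypothetical constants;
# the registration does not specify the actual payment parameters).

def quadratic_score_payoff(reported, outcome_index, base=1.0, scale=0.5):
    """Payoff for a reported probability vector under a quadratic
    scoring rule. `reported` sums to 1; `outcome_index` marks the
    treatment option that actually succeeded."""
    # Brier-style loss: squared distance between the report and
    # the realized outcome indicator vector.
    loss = sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(reported)
    )
    return base - scale * loss

# Example: five treatment options, option 2 (0-indexed) succeeds.
print(quadratic_score_payoff([0.1, 0.2, 0.4, 0.2, 0.1], outcome_index=2))
```

Because the rule is proper, a risk-neutral participant maximizes expected payoff by reporting exactly their subjective probabilities.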
Intervention Start Date
2025-01-01
Intervention End Date
2025-03-31

Primary Outcomes

Primary Outcomes (end points)
1. Belief updating magnitude: Change in probability estimates between prior and posterior beliefs
2. Belief accuracy: Quadratic score of final probability estimates relative to true treatment outcomes
3. Second-order belief accuracy: Correlation between expected informativeness scores and actual belief updates
Primary Outcomes (explanation)
1. Belief updating magnitude is calculated as the absolute difference between prior and posterior probability assignments for each treatment option. For each scenario, we sum these differences across the five treatment options and divide by 2, yielding the total variation distance between the prior and posterior distributions, a measure between 0 and 1 (see the sketch after this list).
2. Belief accuracy is measured using the quadratic scoring rule, which captures how close participants' probability estimates are to the realized treatment outcome.
3. Second-order belief accuracy compares the predicted change in informativeness score with the actual change following AI advice.
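A minimal sketch of the first two measures, assuming beliefs are reported as probability vectors over the five treatment options (variable names are illustrative, not from the study's code):

```python
# Illustrative construction of the primary outcome measures from a
# participant's reported prior and posterior probability vectors.

def updating_magnitude(prior, posterior):
    """Total variation distance: sum of absolute changes across the
    five options, divided by 2, so the measure lies in [0, 1]."""
    return sum(abs(q - p) for p, q in zip(prior, posterior)) / 2.0

def quadratic_accuracy(posterior, outcome_index):
    """Quadratic (Brier-style) score of the final report against the
    realized outcome; lower values mean more accurate beliefs under
    this convention."""
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(posterior)
    )

prior = [0.2, 0.2, 0.2, 0.2, 0.2]
posterior = [0.1, 0.1, 0.6, 0.1, 0.1]
print(updating_magnitude(prior, posterior))   # 0.4
print(quadratic_accuracy(posterior, 2))       # 0.2
```

Second-order belief accuracy can then be computed as the correlation, across scenarios, between the predicted informativeness of the advice and the realized `updating_magnitude`.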

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The experiment employs a 2×2 between-subjects factorial design investigating how individuals incorporate AI advice into medical decision-making. Participants evaluate treatment options across 15 hypothetical medical scenarios, providing probability estimates before and after receiving AI recommendations. The design uses incentive-compatible mechanisms to elicit truthful belief reporting. Treatments vary in the format of AI advice provided. All participants complete pre-experiment cognitive assessments and post-experiment questionnaires measuring algorithm literacy and trust.
Experimental Design Details
Not available
Randomization Method
Participants will be randomly assigned to one of four treatment conditions using a 2×2 factorial design at the session level. Randomization is implemented through the oTree experimental software platform prior to each session, ensuring balanced assignment across treatment conditions.
Randomization Unit
Primary randomization occurs at the individual participant level for the assignment and ordering of medical scenarios. Additional session-level randomization is used to assign different cohorts of participants to experimental timeslots while maintaining balanced characteristics across sessions.
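The two randomization layers described above might be implemented as in the sketch below: whole sessions rotate through the four 2×2 cells in a balanced fashion, and each participant draws an independent ordering of the 15 scenarios. This is an illustrative stand-alone script, not the authors' actual oTree code.

```python
import itertools
import random

# Illustrative two-layer randomization (not the study's oTree code).

CONDITIONS = [
    ("one_suggestion", "no_explanation"),
    ("one_suggestion", "with_explanation"),
    ("two_suggestions", "no_explanation"),
    ("two_suggestions", "with_explanation"),
]

def assign_sessions(n_sessions, seed=0):
    """Balanced session-level assignment: cycle through the four
    cells in a randomly permuted order so counts stay even."""
    rng = random.Random(seed)
    order = rng.sample(CONDITIONS, k=len(CONDITIONS))
    return [cond for _, cond in zip(range(n_sessions), itertools.cycle(order))]

def scenario_order(n_scenarios=15, seed=None):
    """Independent per-participant ordering of the 15 scenarios."""
    rng = random.Random(seed)
    order = list(range(1, n_scenarios + 1))
    rng.shuffle(order)
    return order

print(assign_sessions(8))          # two full rotations of the four cells
print(scenario_order(seed=42))     # one participant's scenario sequence
```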
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
48-64 experimental sessions with 5 participants each (12-16 sessions per treatment condition)
Sample size: planned number of observations
3,600-4,800 participant-scenario observations (240-320 participants × 15 scenarios each)
Sample size (or number of clusters) by treatment arms
Four treatment conditions (2×2 factorial design):
Treatment 1 (One suggestion, No explanation): 60-80 participants (12-16 sessions)
Treatment 2 (One suggestion, With explanation): 60-80 participants (12-16 sessions)
Treatment 3 (Two suggestions, No explanation): 60-80 participants (12-16 sessions)
Treatment 4 (Two suggestions, With explanation): 60-80 participants (12-16 sessions)
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Power Calculation: Minimum Detectable Effect Sizes for Main Outcomes

Below we detail the minimum detectable effect sizes (MDES) for our primary measures, given our experimental design and clustering considerations. We plan to recruit 60–80 participants per treatment arm (240–320 participants in total), each providing 15 observations. This design yields 80% power at a 5% significance level, as outlined below.

1. Belief Updating Magnitude
• Minimum sample (60 per treatment): MDES = 0.22 SD
• Maximum sample (80 per treatment): MDES = 0.19 SD
• Equivalent to a 3.5–3.0 percentage-point difference in probability updates
• Accounts for within-subject correlation (0.3) and session-level clustering

2. Belief Accuracy (Quadratic Score)
• Minimum sample: MDES = 0.28 SD
• Maximum sample: MDES = 0.24 SD
• Equivalent to a 0.17–0.14 point difference in quadratic score
• Adjusts for repeated measures and session-level clustering

3. Second-Order Belief Accuracy
• Minimum sample: MDES = 0.25 SD
• Maximum sample: MDES = 0.21 SD
• Equivalent to a 0.14–0.12 point difference in informativeness score prediction accuracy
• Accounts for the nested data correlation structure

These calculations assume:
• Intra-cluster correlation (ICC) of 0.1 at the session level
• Within-subject correlation of 0.3 across scenarios
• Balanced treatment assignment
• Adjustment for multiple comparisons
• Conservative degrees of freedom estimation
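For reference, the sketch below applies the textbook design-effect approach to an MDES calculation under the stated assumptions (ICC = 0.1 with 5 participants per session; within-subject correlation of 0.3 across 15 scenarios). It is a rough approximation only: the registered figures rest on additional modeling choices (multiple-comparison adjustment, conservative degrees of freedom, the assumed variance decomposition) not reproduced here, so its outputs will not match them exactly.

```python
import math

# Back-of-the-envelope MDES for a two-arm comparison with repeated
# measures and session-level clustering (illustrative only; not the
# calculation behind the registered figures).

def mdes(n_per_arm, k=15, rho=0.3, m=5, icc=0.1,
         z_alpha=1.96, z_power=0.84):
    """MDES in SD units at 5% two-sided significance and 80% power,
    deflating the observation count by standard design effects."""
    n_obs = n_per_arm * k                 # participant-scenario observations
    deff_repeat = 1 + (k - 1) * rho       # repeated scenarios within subject
    deff_cluster = 1 + (m - 1) * icc      # participants within session
    n_eff = n_obs / (deff_repeat * deff_cluster)
    return (z_alpha + z_power) * math.sqrt(2.0 / n_eff)

for n in (60, 80):
    print(f"{n} participants per arm: MDES ≈ {mdes(n):.2f} SD")
```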
IRB

Institutional Review Boards (IRBs)

IRB Name
Center of Behavioral and Experimental Research at Wuhan University
IRB Approval Date
2025-01-06
IRB Approval Number
EM250007