Human-AI Collaboration in Health Care – Evidence From a Field Experiment

Last registered on April 14, 2026

Pre-Trial

Trial Information

General Information

Title
Human-AI Collaboration in Health Care – Evidence From a Field Experiment
RCT ID
AEARCTR-0018339
Initial registration date
April 11, 2026


First published
April 14, 2026, 9:08 AM EDT


Locations

There is information in this trial unavailable to the public.

Primary Investigator

Affiliation
Beijing Normal University

Other Primary Investigator(s)

PI Affiliation
Beijing Normal University
PI Affiliation
Beijing Normal University
PI Affiliation
Peking Union Medical College Hospital

Additional Trial Information

Status
In development
Start date
2026-04-11
End date
2027-12-31
Secondary IDs
National Natural Science Foundation of China (Project No. 72473010)
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
We examine how different types of AI-assisted information affect human decision-making in high-stakes health care settings through a field experiment.
External Link(s)

Registration Citation

Citation
Gao, Chao et al. 2026. "Human-AI Collaboration in Health Care – Evidence From a Field Experiment." AEA RCT Registry. April 14. https://doi.org/10.1257/rct.18339-1.0
Experimental Details

Interventions

Intervention(s)
Intervention Start Date
2026-04-11
Intervention End Date
2027-04-10

Primary Outcomes

Primary Outcomes (end points)
Preoperative surgical planning: the accuracy and consistency of identifying pulmonary nodules for resection, the accuracy of surgical procedure selection, and the inter-surgeon variability in surgical procedure selection.

Effort: the number of questions each physician answered, the number of sentences or words used when answering patients’ questions, and the time spent on preoperative surgical planning.
Primary Outcomes (explanation)
The accuracy of surgical procedure selection is measured against a reference standard defined as the post-hoc majority choice of senior surgeons.
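The majority-choice reference standard can be illustrated with a minimal sketch. It is not the investigators' scoring code: the function names, the procedure labels, and the rule that a tie among senior surgeons leaves a case without a reference are all assumptions made for illustration.

```python
from collections import Counter


def majority_reference(senior_choices):
    """Return the most frequent procedure among senior surgeons.

    Returns None when there are no choices or when the top choices tie
    (the tie rule is an assumption; the registration does not specify one).
    """
    counts = Counter(senior_choices).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def accuracy(plans, references):
    """Share of cases whose planned procedure equals the reference.

    `plans` and `references` map a case id to a procedure label.
    Cases without a reference (None) are skipped.
    """
    scored = [
        (plans[case], ref)
        for case, ref in references.items()
        if ref is not None and case in plans
    ]
    if not scored:
        return None
    return sum(plan == ref for plan, ref in scored) / len(scored)
```

Under these assumptions, a physician's accuracy is the fraction of scored cases in which the selected procedure matches the senior surgeons' majority choice.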

Secondary Outcomes

Secondary Outcomes (end points)
An alternative measure of the accuracy of surgical procedure selection (if the dataset is available): whether the surgical procedure matches the expert consensus, which serves as the gold standard.
Expected resection outcome (if the dataset is available): indicators for potential excessive resection and insufficient resection based on the surgical plan.
Intraoperative risk (if the dataset is available): whether the surgery is risky or not.
Expected surgical progress (if the dataset is available): operative time, surgical cost, and postoperative recovery status.
Examinations (if the dataset is available): whether physicians recommend further examination and, if so, its perceived clinical importance.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
We conduct an audit study in the online health care market to examine how different types of AI-assisted information affect physicians’ clinical decision-making.
Experimental Design Details
Not available
Randomization Method
We use stratified randomization based on physicians’ gender, title, and platform.
Randomization Unit
The unit of randomization is the individual physician.
Was the treatment clustered?
No
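
The stratified randomization described under Randomization Method could be sketched as follows. This is a hypothetical illustration, not the investigators' procedure: the record fields (`id`, `gender`, `title`, `platform`), the arm labels, and the seed are assumptions. Physicians are shuffled within each gender × title × platform stratum and then assigned to the three arms in rotation, so arm sizes differ by at most one both within every stratum and overall.

```python
import random
from collections import defaultdict

# Arm labels are assumed; the registration names one control and two treatments.
ARMS = ("control", "treatment_1", "treatment_2")


def stratified_assign(physicians, seed=0):
    """Assign each physician to an arm, balancing within strata.

    `physicians` is a list of dicts with hypothetical keys
    "id", "gender", "title", and "platform".
    Returns a dict mapping physician id to arm label.
    """
    rng = random.Random(seed)

    # Group physician ids by stratum.
    strata = defaultdict(list)
    for p in physicians:
        strata[(p["gender"], p["title"], p["platform"])].append(p["id"])

    assignment = {}
    position = 0  # running counter carried across strata for overall balance
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)  # random order within the stratum
        for pid in ids:
            assignment[pid] = ARMS[position % len(ARMS)]
            position += 1
    return assignment
```

Carrying the rotation counter across strata is a design choice in this sketch: with 150 physicians it yields exactly 50 per arm while keeping every stratum as balanced as its size allows.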

Experiment Characteristics

Sample size: planned number of clusters
n/a
Sample size: planned number of observations
150 physicians
Sample size (or number of clusters) by treatment arms
50 physicians control, 50 physicians treatment 1, 50 physicians treatment 2.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
We conduct a power analysis using effect size estimates from prior studies, which report mean differences in post-treatment preoperative planning outcomes ranging from 0.736 to 0.900. Assuming a two-tailed significance level of 0.05, statistical power of 0.80, and an equal allocation ratio of 1:1, the analysis indicates that 28 participants are required per treatment arm. To ensure adequate power and provide a more conservative margin for the primary analyses, we increase the sample size to 50 participants per treatment arm. Across the three arms, this yields a total sample of 150 participants.
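
A sample size calculation of this kind can be sketched with the standard normal-approximation formula for comparing two group means. The sketch rests on assumptions not stated in the registration: that the reported mean differences are standardized effect sizes (Cohen's d), that the comparison is a two-sample test between two arms, and that a normal approximation is acceptable. Because the registration does not name the test or software used, the result need not match the reported figure exactly.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided, two-sample comparison of means.

    Uses n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2 with equal allocation,
    where d is a standardized effect size (an assumption about the inputs).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Evaluate the two ends of the effect size range cited above.
for d in (0.736, 0.900):
    print(d, n_per_arm(d))
```

The smaller effect size drives the requirement, which is why the lower end of the range is the relevant input when choosing a conservative sample size.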
IRB

Institutional Review Boards (IRBs)

IRB Name
Improving the Quality of Human-AI collaboration: Evidence from a Field Experiment in the Online Health Care Service Market
IRB Approval Date
2026-03-10
IRB Approval Number
BNU-BS-IRB 2026-051
Analysis Plan

There is information in this trial unavailable to the public.