The Big Unknown: A Journey into Generative AI's Transformative Effect on Professions, starting with Medical Practitioners

Last registered on April 25, 2024

Pre-Trial

General Information

Title

RCT ID

AEARCTR-0013399

Initial registration date

April 17, 2024

First published

April 25, 2024, 11:44 AM EDT

Locations

There is information in this trial unavailable to the public.

Primary Investigator

Name

Nicholas Rounding

Affiliation

Maastricht University

Other Primary Investigator(s)

PI Name

Mark Levels

PI Affiliation

Maastricht University

PI Name

Marie-Christine Fregin

PI Affiliation

Maastricht University

Additional Trial Information

Status

Ongoing

Start date

2024-01-01

End date

2024-12-31

Keywords

Health, Labor

Additional Keywords

AI, LLM, diagnosis, ChatGPT, healthcare, General Practitioners, scenarios

JEL code(s)

Secondary IDs

Prior work

This trial does not extend or rely on any prior RCTs.

Abstract

Artificial Intelligence (AI) is developing at a rapid pace and has great potential to transform many aspects of life. It has already made inroads into the healthcare sector, assisting doctors and researchers in areas such as diagnostics and drug discovery. However, many of these applications rely on an algorithm trained on specific data for a single task. Recent developments in ‘large language models’ (LLMs) offer a glimpse into a future where doctors may have an AI assistant by their side at all times. Studies have already shown that large language models (GPT-4 and Med-PaLM) perform as well as humans on medical licensing exams, which suggests that these models have absorbed the relevant medical knowledge. Yet performing well on licensing exams does not equate to performing well in the field, and so far little is known about how well these AI applications perform in practice or to what extent they could be a useful tool for physicians in their daily work.
In our research project, we will assess the potential value of AI for general practitioners (GPs) in three countries (the Netherlands, Kenya and Indonesia) and provide insights into potential futures of work. We will conduct randomized controlled trials on the use of a large language model AI assistant by GPs, on the premise that such an assistant may improve the quality of care provided, with effects on patient management; reduce healthcare costs; and decrease GPs’ workloads. An AI assistant could help a GP deal with a particularly tricky diagnosis, prepare for a consultation, or resolve complex consultations more quickly. We will establish whether AI can do this well, for whom, and under which conditions. Informants will be trained healthcare workers only; in the Netherlands, we will recruit trainee and practicing GPs. They will be asked to respond to written, hypothetical but realistic patient scenarios, to diagnose medical conditions, and to prescribe treatment.
We will answer the following research questions:
1. To what extent and how can LLM technology help general practitioners to provide better and faster diagnoses?
2. How does the causal effect of LLMs on general practitioners’ care vary with doctors’ experience levels, informational complexity of anamnesis, condition traits and condition incidence rates?
3. How do effects vary between The Netherlands, Kenya and Indonesia, and what can we sensibly infer from cross-national differences?

External Link(s)

Registration Citation

Citation

Fregin, Marie-Christine, Mark Levels and Nicholas Rounding. 2024. "The Big Unknown: A Journey into Generative AI's Transformative Effect on Professions, starting with Medical Practitioners." AEA RCT Registry. April 25. https://doi.org/10.1257/rct.13399-1.0

Sponsors & Partners

Experimental Details

Interventions

Intervention(s)

Intervention Start Date

2024-08-01

Intervention End Date

2024-10-31

Primary Outcomes

Primary Outcomes (end points)

The main outcome of the experiment will be a quality-of-care score generated from clinical vignettes. This score will assess the diagnostic reasoning, diagnosis, and patient care aspects of general practice. Each component of the quality-of-care metric will also be analysed as a separate outcome. We will also analyse the chat logs generated by participants who use the AI chatbot. Finally, participants will be asked to complete a short survey after the experiment.

Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)

Secondary Outcomes (explanation)

Experimental Design

Informants will be asked to complete a set of ‘clinical vignettes’ online; these are hypothetical but plausible patient scenarios. Each vignette consists of five steps: Patient History, Physical Examination, Tests Ordered, Diagnosis, and Prescribed Treatment. In each step, informants will be asked to provide answers in an online Qualtrics survey. Half of the informants will be randomly assigned an AI assistant, which they will be instructed, but not forced, to use. The AI assistant will be an interface developed by UM (Maastricht University) researchers that accesses GPT-4 through the OpenAI API. The other half will not receive the AI assistant.

Experimental Design Details

Not available

Randomization Method

Coin flip

Randomization Unit

Individual
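
As an illustration only, individual-level coin-flip randomization could be sketched as below. The registration does not describe how the coin flip is implemented, so the function name `assign_arms`, the arm labels, the participant IDs, and the use of a seeded generator are all assumptions made for this example.

```python
import random


def assign_arms(participant_ids, seed=None):
    """Assign each participant to an arm by an independent fair coin flip.

    Hypothetical helper: the arm labels and the seeding are illustrative
    assumptions, not the registered procedure.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    return {
        pid: ("ai_assistant" if rng.random() < 0.5 else "control")
        for pid in participant_ids
    }


if __name__ == "__main__":
    # Hypothetical participant IDs, used only to show the call.
    ids = [f"gp_{i:03d}" for i in range(1, 11)]
    for pid, arm in assign_arms(ids, seed=2024).items():
        print(pid, arm)
```

Independent coin flips give equal-sized arms only in expectation. If exactly half of the informants must receive the AI assistant, shuffling the participant list and splitting it in two would be the usual alternative.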

Was the treatment clustered?

Experiment Characteristics

Sample size: planned number of clusters

n/a

Sample size: planned number of observations

60-120 per country
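
The planned sample size limits how small an effect the trial can detect. The sketch below is a rough illustration, not the authors' power calculation. It assumes a two-sided test at the 5% level, 80% power, an equal split between arms, one outcome per participant measured in standard-deviation units, a normal approximation, and no covariate adjustment. None of these assumptions come from the registration, and the function name `mde_in_sd` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mde_in_sd(n_total, share_treated=0.5, alpha=0.05, power=0.80):
    """Approximate minimum detectable effect, in outcome standard deviations.

    Standard two-arm formula under a normal approximation:
    MDE = (z_{1-alpha/2} + z_{power}) * sqrt(1 / (P * (1 - P) * N)).
    All default values are illustrative assumptions.
    """
    z = NormalDist()  # standard normal distribution
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * sqrt(1.0 / (share_treated * (1 - share_treated) * n_total))


if __name__ == "__main__":
    # The two ends of the planned range of observations per country.
    for n in (60, 120):
        print(n, round(mde_in_sd(n), 2))
```

Under these assumptions, 60 participants in a country correspond to a minimum detectable effect of roughly 0.72 standard deviations, and 120 participants to roughly 0.51. Each participant completes several vignettes, so an analysis at the vignette level would need to account for the correlation of responses within a participant.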

Sample size (or number of clusters) by treatment arms

n/a

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

Supporting Documents and Materials

IRB

Institutional Review Boards (IRBs)

IRB Name

IRB Approval Date

IRB Approval Number

Analysis Plan