The Big Unknown: A Journey into Generative AI's Transformative Effect on Professions, starting with Medical Practitioners

Last registered on April 25, 2024

Pre-Trial

Trial Information

General Information

Title
The Big Unknown: A Journey into Generative AI's Transformative Effect on Professions, starting with Medical Practitioners
RCT ID
AEARCTR-0013399
Initial registration date
April 17, 2024

First published
April 25, 2024, 11:44 AM EDT

Locations

There is information in this trial unavailable to the public.

Primary Investigator

Affiliation
Maastricht University

Other Primary Investigator(s)

PI Affiliation
Maastricht University
PI Affiliation
Maastricht University

Additional Trial Information

Status
On going
Start date
2024-01-01
End date
2024-12-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Artificial Intelligence (AI) is developing at a rapid pace and has great potential to transform many aspects of life. It has already made inroads into the healthcare sector, assisting doctors and researchers in areas such as diagnostics and drug discovery. However, many of these applications rely on an algorithm trained on specific data for a single task. Recent developments in ‘large language models’ (LLMs) offer a glimpse into a future where doctors may have an AI assistant by their side at all times. Studies have already shown that large language models (GPT-4 and Med-PaLM) perform as well as humans on medical licensing exams, indicating that these models have processed the relevant medical information. Yet performing well on licensing exams does not equate to performing well in the field, and so far little is known about how well these AI applications perform in practice or to what extent they could be a useful tool for physicians in their daily work.
In our research project, we will assess the potential value of AI for general practitioners (GPs) in three countries (the Netherlands, Kenya and Indonesia) and provide insights into potential futures of work. We will conduct randomized controlled trials on the use of a large language model AI assistant by GPs, with the idea that such an assistant may help to improve the quality of care provided, affecting patient management; reduce healthcare costs; and decrease GPs’ workloads. An AI assistant could help a GP deal with a particularly tricky diagnosis, prepare for a consultation, or resolve complex consultations more quickly. We will establish whether AI can do this well, for whom, and under which conditions. Informants will be trained healthcare workers only; in the Netherlands, we will use trainee and practicing GPs. They will be asked to respond to written, hypothetical but realistic patient scenarios, to diagnose medical conditions, and to prescribe treatment.
We will answer the following research questions:
1. To what extent and how can LLM technology help general practitioners to provide better and faster diagnoses?
2. How does the causal effect of LLMs on general practitioners’ care vary with doctors’ experience levels, informational complexity of anamnesis, condition traits and condition incidence rates?
3. How do effects vary between The Netherlands, Kenya and Indonesia, and what can we sensibly infer from cross-national differences?

External Link(s)

Registration Citation

Citation
Fregin, Marie-Christine, Mark Levels and Nicholas Rounding. 2024. "The Big Unknown: A Journey into Generative AI's Transformative Effect on Professions, starting with Medical Practitioners." AEA RCT Registry. April 25. https://doi.org/10.1257/rct.13399-1.0
Experimental Details

Interventions

Intervention(s)
Intervention Start Date
2024-08-01
Intervention End Date
2024-10-31

Primary Outcomes

Primary Outcomes (end points)
The main outcome of the experiment will be a quality-of-care score generated from clinical vignettes. It will assess the diagnostic reasoning, diagnosis, and patient care aspects of general practice. Outcomes will also include each individual component of the quality-of-care metric. We will also analyse the chat logs generated by participants who use the AI chatbot. Finally, participants will be asked to complete a short survey after the experiment.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Informants will be asked to complete a set of ‘clinical vignettes’ online; these are hypothetical but realistic patient scenarios. Each vignette consists of five steps: Patient History, Physical Examination, Tests Ordered, Diagnosis, and Prescribed Treatment. In each step, informants will be asked to provide answers in an online Qualtrics survey. Half of the informants will be randomly assigned an AI assistant, which they will be instructed, but not forced, to use. The AI assistant will be an interface developed by Maastricht University (UM) researchers that accesses GPT-4 through the OpenAI API. The other half will not receive the AI assistant.
Experimental Design Details
Not available
Randomization Method
Coin flip
Randomization Unit
Individual
Was the treatment clustered?
No
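
The individual-level, unclustered coin-flip assignment described above can be sketched as follows. This is a minimal illustration only, not the study's actual procedure: the function name assign_arms, the arm labels, the participant IDs and the seed are all hypothetical. Note that independent fair coin flips produce arms of roughly, not exactly, equal size.

```python
import random


def assign_arms(participant_ids, seed=0):
    """Assign each participant to an arm with an independent fair coin flip.

    Hypothetical sketch: the arm labels and the seeding scheme are
    assumptions, not details taken from the registration.
    """
    rng = random.Random(seed)  # seeded so the assignment can be reproduced
    return {
        pid: ("ai_assistant" if rng.random() < 0.5 else "control")
        for pid in participant_ids
    }


# Example with 60 made-up participant IDs (the lower bound per country).
assignment = assign_arms([f"gp_{i:03d}" for i in range(1, 61)], seed=2024)
n_treated = sum(arm == "ai_assistant" for arm in assignment.values())
print(f"{n_treated} treated, {len(assignment) - n_treated} control")
```

If exactly equal arms were required, a shuffled list split in half would be the usual alternative, but that would be a different method from the coin flip stated in the registration.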

Experiment Characteristics

Sample size: planned number of clusters
n/a
Sample size: planned number of observations
60-120 per country
Sample size (or number of clusters) by treatment arms
n/a
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
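
The registration does not report a value for this field. Purely as an illustration of how the planned sample of 60-120 observations per country relates to a detectable effect, the sketch below applies the standard two-sample normal approximation. Every parameter is an assumption rather than a figure from the registration: a two-sided test at alpha = 0.05, power of 0.80, an equal split between arms, and an outcome expressed in standard-deviation units. The function name mde_in_sd_units is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mde_in_sd_units(n_total, alpha=0.05, power=0.80, share_treated=0.5):
    """Approximate minimum detectable effect, in outcome standard deviations.

    Uses the two-sample normal approximation for a difference in means.
    All defaults are illustrative assumptions, not registered values.
    """
    z = NormalDist()  # standard normal distribution
    n_treat = n_total * share_treated
    n_control = n_total - n_treat
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * sqrt(1 / n_treat + 1 / n_control)


# Lower and upper bounds of the planned per-country sample.
for n in (60, 120):
    print(n, round(mde_in_sd_units(n), 2))  # about 0.72 and 0.51
```

Under these assumptions, a single country's sample could detect only fairly large standardized effects; covariate adjustment or pooling across countries would change the figure.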
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number