Abstract
Artificial Intelligence (AI) is developing at a rapid pace and has great potential to transform many aspects of life. It has already made inroads into the healthcare sector, assisting doctors and researchers in areas such as diagnostics and drug discovery. However, most of these uses of AI rely on an algorithm that is trained on specific data for one specific task. Recent developments in ‘large language models’ offer a glimpse into a future where doctors may have an AI assistant by their side at all times. Studies have already shown that large language models (such as GPT-4 and Med-PaLM) perform as well as humans on medical licensing exams, suggesting that these models have absorbed the relevant medical knowledge. Yet performing well on licensing exams does not equate to performing well in the field. Thus far, little is known about how well these AI applications perform in practice, or to what extent they could be a useful tool for physicians in their daily work.
In our research project, we will assess the potential value of AI for general practitioners (GPs) in three countries (the Netherlands, Kenya and Indonesia) and provide insights into potential futures of work. We will conduct randomized controlled trials on the use of a large language model AI assistant by GPs, with the idea that such an assistant may help to improve the quality of care and patient management; reduce healthcare costs; and decrease GPs’ workloads. An AI assistant could help a GP deal with a particularly tricky diagnosis, prepare for a consultation, or resolve complex consultations more quickly. We will establish whether AI can do this well, for whom, and under which conditions. Participants will be trained healthcare workers only; in the Netherlands, they will be trainee and practicing GPs. They will be asked to respond to written, hypothetical but realistic patient scenarios by diagnosing medical conditions and prescribing treatment.
We will answer the following research questions:
1. To what extent and how can LLM technology help general practitioners to provide better and faster diagnoses?
2. How does the causal effect of LLMs on general practitioners’ care vary with doctors’ experience levels, informational complexity of anamnesis, condition traits and condition incidence rates?
3. How do these effects vary between the Netherlands, Kenya and Indonesia, and what can we sensibly infer from cross-national differences?