Investigating the productivity impact of Generative AI in the public sector

Last registered on September 12, 2024

Pre-Trial

Trial Information

General Information

Title
Investigating the productivity impact of Generative AI in the public sector
RCT ID
AEARCTR-0014140
Initial registration date
August 27, 2024


First published
September 12, 2024, 4:31 PM EDT


Locations

Region

Primary Investigator

Affiliation
Princeton University

Other Primary Investigator(s)

PI Affiliation

Additional Trial Information

Status
Ongoing
Start date
2024-07-22
End date
2024-10-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This project proposes to examine the impact of new generative AI tools on the productivity of public sector workers. It follows in the vein of recent research examining how new generative AI tools such as ChatGPT, Bard and Copilot can improve productivity across workforces and automate a variety of common tasks done by workers, freeing up their time. This study will examine whether generative AI tools hold similar promise for government workers, particularly those who work in roles that involve the development and implementation of policy.
External Link(s)

Registration Citation

Citation
Leigh, Andrew and Christopher Wong. 2024. "Investigating the productivity impact of Generative AI in the public sector." AEA RCT Registry. September 12. https://doi.org/10.1257/rct.14140-1.0
Experimental Details

Interventions

Intervention(s)
The study will involve having public servants complete fictional tasks, with and without the assistance of generative AI tools. The tasks are ones commonly performed by Australian public servants and are designed both to replicate their usual work experiences and to provide a meaningful baseline against which the effects (or lack thereof) of generative AI can be ascertained.
Intervention (Hidden)
The proposed experiment aims to assess the impact of generative AI tools on the productivity of public sector employees in Australia, specifically focusing on policy development and implementation roles. Participants will be given fictional tasks: four tasks per participant, two completed with generative AI assistance and two without (which tasks are completed with generative AI assistance is randomised). Tasks mimic common public service activities; each requires reading an instruction and providing a text-based response.
Procedure: Tasks are completed in a single sitting but can be done at participants' convenience within a designated two-month period.
Each task has a time limit of 30 minutes. Tasks are graded by independent external public servants.
Intervention Start Date
2024-08-01
Intervention End Date
2024-09-30

Primary Outcomes

Primary Outcomes (end points)
The primary outcome variable is the quality of the responses to each task. These will be measured against a marking rubric by senior public servants independent of the study.
Primary Outcomes (explanation)
The marking rubric will grade responses along a sliding scale, taking into account different factors that are relevant to the quality of public service advice.

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
All participants receive a primer on using generative AI and then four tasks, each involving reading some information and responding to one or more requests. Participants perform two of the tasks without AI assistance and two with AI assistance.
Experimental Design Details
Type of Study: Within-subjects design, allowing each participant to serve as their own control.
Sampling Method: Stratified random sampling to ensure varied experiences among participants.
Randomization: The order of tasks (with or without AI assistance) will be randomized for each participant to control for order effects and potential biases in task performance.
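The within-subjects assignment described above can be illustrated with a small sketch. The function below is hypothetical (the registry does not publish the actual randomization code, and the task names are placeholders): for each participant it shuffles the task order and then randomly selects which two of the four tasks are completed with AI assistance.

```python
import random

def assign_conditions(participant_ids,
                      tasks=("task_A", "task_B", "task_C", "task_D"),
                      seed=42):
    """Hypothetical sketch of the within-subjects randomization:
    each participant completes all four tasks, two with AI assistance
    and two without, in a randomised order."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    assignments = {}
    for pid in participant_ids:
        order = list(tasks)
        rng.shuffle(order)                    # randomise task order (order effects)
        with_ai = set(rng.sample(order, 2))   # two tasks drawn for the AI condition
        # list of (task, uses_ai) pairs in the order the participant sees them
        assignments[pid] = [(t, t in with_ai) for t in order]
    return assignments
```

Because every participant serves as their own control, each assignment contains exactly two AI-assisted and two unassisted tasks.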
Randomization Method
Done in office by a computer
Randomization Unit
Subject
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
10-20 individuals
Sample size: planned number of observations
40-80 observations
Sample size (or number of clusters) by treatment arms
10-20 individuals as controls and treatment
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
Princeton University Institutional Review Board
IRB Approval Date
2024-07-24
IRB Approval Number
17067

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials