Evaluating an AI-Powered Research Development Tool for Academic Productivity and Well-being


Trial Information

General Information

Title
Evaluating an AI-Powered Research Development Tool for Academic Productivity and Well-being
RCT ID
AEARCTR-0017749
Initial registration date
January 22, 2026


First published
January 28, 2026, 7:07 AM EST


Last updated
April 20, 2026, 5:28 PM EDT


Locations

Not available

Primary Investigator

Affiliation
KU Leuven

Other Primary Investigator(s)

PI Affiliation
London School of Economics (LSE)

Additional Trial Information

Status
In development
Start date
2026-02-01
End date
2028-06-30
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This randomized controlled trial (RCT) evaluates the causal impact of an AI-powered Research Development Tool on the academic productivity and well-being of early-career researchers. Participants — PhD students and junior economists within five years of their doctorate — will be randomly assigned, within career-stage strata, to either a control group with access to a general-purpose AI or a treatment group with access to a comprehensive AI-driven Research Development Suite offering structured, expert-level feedback on research papers. The control group will receive access 12 months after the experiment's start date. Over a 24-month intervention period, we measure changes in externally evaluated research quality, paper submission rates, job satisfaction, and work-life balance.
Our central question is whether the productivity effects of AI access vary systematically with the researcher's level of accumulated tacit knowledge and structural advantage — specifically evaluating heterogeneity by career stage and controlling for baseline institutional prestige. We test whether AI acts as a substitute for elite networks (compressing inequality) or a complement to existing advantages (amplifying inequality).
External Link(s)

Registration Citation

Citation
Polanco-Jimenez, Jaime and Almudena Sevilla. 2026. "Evaluating an AI-Powered Research Development Tool for Academic Productivity and Well-being." AEA RCT Registry. April 20. https://doi.org/10.1257/rct.17749-2.0
Experimental Details

Interventions

Intervention(s)
This is a two-arm, parallel-group, single-blind randomized controlled trial (RCT). Participants — PhD students and junior economists within five years of their doctorate — will be randomly allocated to either a control group or a treatment group in a 1:1 ratio, using stratified randomization by career stage. The intervention will last for 24 months. Data will be collected through baseline and endline surveys and continuous behavioral logging from the AI platform. The study aims to evaluate whether the productivity effects of AI access vary systematically with the researcher's level of accumulated tacit knowledge.
Intervention Start Date
2026-05-11
Intervention End Date
2028-06-30

Primary Outcomes

Primary Outcomes (end points)
1. Change in objective academic progression milestones over the 24-month period, specifically measured by: (a) rate of desk rejections, (b) invitations to Revise and Resubmit (R&R), and (c) total number of working papers publicly circulated (e.g., NBER, CEPR, SSRN, or university working paper series)
2. Heterogeneity of the treatment effect on objective publication milestones by career stage: whether the impact of AI access on desk rejections, R&R invitations, and the volume of circulated research differs systematically between PhD students and junior economists.
3. Change in the number of papers submitted for publication.
4. Change in self-reported job satisfaction.
5. Change in self-reported work-life balance satisfaction.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
1. Change in algorithmic paper quality (0-100 score). At endline, working papers from both the treatment and control groups will be processed blindly through the 'Editor Agent' prompt of the ResearchAI tool to calculate a standardized quality score (see the illustrative sketch below). This serves as a mechanistic measure of algorithmic compliance rather than a measure of ultimate scientific validity.
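The registry does not describe the scoring pipeline's interface. A purely hypothetical sketch of the blinding logic, with `score_paper` standing in for the Editor Agent call (its name and signature are our inventions):

```python
# Hypothetical sketch of blinded endline scoring: papers from both arms are
# pooled, stripped of their arm label, and scored in random order.
import random

def score_paper(pdf_path: str) -> float:
    """Placeholder for the ResearchAI 'Editor Agent' prompt, whose real
    interface is not described in the registry; returns a 0-100 score."""
    raise NotImplementedError("stand-in for the proprietary Editor Agent")

def blinded_scores(papers, seed=0):
    """papers: list of (paper_id, arm, pdf_path) triples.
    The scoring loop never sees the arm label or a systematic ordering."""
    rng = random.Random(seed)
    batch = [(pid, path) for pid, _arm, path in papers]  # drop the arm label
    rng.shuffle(batch)  # randomize order so arm cannot be inferred from position
    return {pid: score_paper(path) for pid, path in batch}
```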
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
This is a two-arm, parallel-group, single-blind randomized controlled trial (RCT) evaluated over a 24-month period. The study investigates the causal impact of a structured, AI-powered Research Development Tool on the academic productivity of early-career researchers. Participants will be stratified by career stage (PhD students vs. junior economists within five years of their doctorate) and randomized in a 1:1 ratio into either a treatment group or a waitlist control group.

The treatment group will receive immediate access to the AI tool, which provides expert-level feedback on working papers. The waitlist control group will conduct their research as usual (which may include ad-hoc use of general-purpose LLMs) and will receive access to the treatment tool only after the first 12 months of the intervention. Our primary objective is to measure changes in real-world publication milestones (e.g., desk rejection rates, Revise & Resubmit invitations, and working paper circulation). Secondarily, we will test whether the treatment effect varies systematically with the researcher's level of accumulated tacit knowledge (career stage), identifying whether AI access compresses or amplifies existing inequalities in research output; one way to write this test is sketched below.
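The registry does not write out an estimating equation. As our reading of the design above (notation ours, not the registry's), the heterogeneity test corresponds to a standard interaction specification:

```latex
Y_i = \beta_0 + \beta_1 T_i + \beta_2 S_i + \beta_3 (T_i \times S_i) + \varepsilon_i
```

where Y_i is an outcome such as desk rejections or R&R invitations, T_i indicates assignment to the AI tool, and S_i indicates the junior-economist stratum. Beta_1 is then the treatment effect for PhD students, and beta_3 is the career-stage interaction: under the study's framing, beta_3 < 0 (larger gains for PhD students, who have less accumulated tacit knowledge) would indicate that AI substitutes for elite training and compresses inequality, while beta_3 > 0 would indicate complementarity.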
Experimental Design Details
Not available
Randomization Method
Individual participants will be randomly assigned to one of the two arms using computerized stratified random assignment, as sketched below. Stratification is based on a single binary variable: Career Stage (PhD Student vs. Junior Economist).
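A minimal sketch of that assignment step, assuming hypothetical stratum labels and a fixed seed (this is illustrative, not the study's actual script):

```python
# Stratified 1:1 randomization: shuffle within each career-stage stratum,
# then split each stratum evenly between treatment and control.
import random

def stratified_assign(participants, seed=20260511):  # seed value is illustrative
    """participants: iterable of (participant_id, career_stage) pairs, with
    career_stage in {'phd_student', 'junior_economist'}.
    Returns {participant_id: 'treatment' or 'control'}."""
    rng = random.Random(seed)
    assignment = {}
    for stratum in ('phd_student', 'junior_economist'):
        ids = sorted(pid for pid, stage in participants if stage == stratum)
        rng.shuffle(ids)  # sorting first makes the draw independent of input order
        half = len(ids) // 2
        assignment.update({pid: 'treatment' for pid in ids[:half]})
        assignment.update({pid: 'control' for pid in ids[half:]})
    return assignment
```

Sorting each stratum before shuffling makes the assignment a deterministic function of the seed and roster, so the draw can be audited after the fact.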
Randomization Unit
Individual participant

Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
Not applicable (individual-level randomization).
Sample size: planned number of observations
Our primary recruitment target is 504 participants (252 per arm), which provides 80% power to detect a moderate treatment × career-stage interaction (d = 0.50) and more than 80% power for a main ATE of the same magnitude. This is our baseline operational target; because the existing literature on AI and cognitively demanding tasks (Brynjolfsson et al., 2023; Noy & Zhang, 2023) reports effects in the small-to-medium range, the power analysis below also covers smaller effects. Gender interactions are pre-specified as exploratory: detecting a triple interaction (treatment × career stage × gender) requires approximately four times the sample needed for the main interaction.
Sample size (or number of clusters) by treatment arms
Arm 1 (Control Group - waitlist/status quo): 252 participants
Arm 2 (Treatment Group - RefereeAI Suite): 252 participants
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Power calculations assume a two-arm, individual-level randomized trial with no clustering (α = 0.05, power = 80%). Crucially, our primary hypothesis investigates whether the AI tool democratizes tacit knowledge, meaning we are testing an interaction effect (Treatment × Career Stage) rather than just an Average Treatment Effect (ATE). Our baseline recruitment target of 504 participants is powered under an optimistic scenario. Assuming a roughly 50/50 stratified split between PhD students and junior economists, the Minimum Detectable Effect Size (MDES) for the interaction under different enrollment scenarios is as follows:

- Optimistic target (moderate effect, MDES = 0.50 standard deviations): requires 504 total participants (252 per arm). This is our baseline operational target. (Note: detecting the main ATE alone at this magnitude would require just 126 total participants.)
- Reference scenario (small-to-moderate effect, MDES = 0.30 SD): requires 1,396 total participants (698 per arm). If recruitment exceeds our baseline target, this is our extended goal.
- Conservative scenario (small effect, MDES = 0.20 SD): requires 3,140 total participants (1,570 per arm).

Operational constraints on power: these calculations depend on the demographic balance of our strata. Statistical power for an interaction is constrained by the size of the smallest cell, because the precision of the interaction estimate scales with p(1−p), which is maximized at p = 0.5. If our real-world recruitment pool deviates from a 50/50 split, the required sample size increases to maintain the same MDES; for example, to detect an effect of d = 0.50, a 60/40 split increases the required sample from 504 to 524 participants to preserve the necessary minimum cell size. To transparently pre-register our power constraints, we outline the expected sample requirements under varying degrees of demographic imbalance at d = 0.50:

- Optimal split (50/50): requires 504 total participants (252/arm); minimum cell size 126. This is our baseline operational target.
- Mild imbalance (60/40 or 40/60): requires 524 total participants (262/arm).
- Moderate imbalance (70/30 or 30/70): requires 598 total participants (299/arm).
- Severe imbalance (80/20 or 20/80): requires 786 total participants (393/arm).
- Extreme imbalance (90/10 or 10/90): requires 1,396 total participants (698/arm).

If the true effect of the AI tool is smaller (d = 0.30), the required sample sizes increase drastically:

- Optimal split (50/50): requires 1,396 total participants (698/arm).
- Moderate imbalance (70/30 or 30/70): requires 1,662 total participants (831/arm).
- Severe imbalance (80/20 or 20/80): requires 2,182 total participants (1,091/arm).

The sketch below reproduces these figures from the standard power formula.
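As a check, the registered targets follow from the normal-approximation formula for a difference in means and its stratified interaction. A minimal sketch (our own, assuming unit outcome variance and 1:1 assignment within strata; standard library only):

```python
# Reproduce the registered sample-size targets (alpha = 0.05 two-sided,
# power = 80%).
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975) + NormalDist().inv_cdf(0.80)  # ~2.80

def n_total_ate(mdes):
    """Total N to detect a main ATE of `mdes` SDs: N = 4 z^2 / mdes^2."""
    return 4 * Z**2 / mdes**2

def n_total_interaction(mdes, p):
    """Total N to detect a treatment x career-stage interaction of `mdes` SDs
    when a share p of the sample falls in one stratum. The interaction
    estimate has variance ~ 4 / (N p (1 - p)); p(1 - p) is largest at p = 0.5,
    so a balanced split minimizes N = 4 z^2 / (mdes^2 p (1 - p))."""
    return 4 * Z**2 / (mdes**2 * p * (1 - p))

print(f"main ATE, d = 0.50: N = {n_total_ate(0.50):.1f}")  # ~125.6 -> 126
for p in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"interaction, d = 0.50, {p:.0%} split: N = {n_total_interaction(0.50, p):.1f}")
# -> 502.3, 523.3, 598.0, 784.9, 1395.4: the registered 504, 524, 598, 786,
#    and 1,396 after rounding to whole participants per arm.
```

The same function gives the smaller-effect rows, e.g. n_total_interaction(0.30, 0.5) is roughly 1,395, matching the registered 1,396.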
IRB

Institutional Review Boards (IRBs)

IRB Name
The London School of Economics and Political Science
IRB Approval Date
2026-01-27
IRB Approval Number
686457