Investigating the Retrieval Effect with and without AI Support: A Randomized Controlled Trial among College Students

Last registered on May 06, 2026

Pre-Trial

Trial Information

General Information

Title
Investigating the Retrieval Effect with and without AI Support: A Randomized Controlled Trial among College Students
RCT ID
AEARCTR-0018528
Initial registration date
May 01, 2026

First published
May 06, 2026, 10:53 AM EDT

Locations

There is information in this trial unavailable to the public.

Primary Investigator

Affiliation
Virginia Tech

Other Primary Investigator(s)

PI Affiliation
Virginia Tech

Additional Trial Information

Status
Ongoing
Start date
2026-04-14
End date
2026-05-20
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Generative-AI assistants are increasingly used by undergraduates, but their effect on the cognitive benefits of retrieval practice is unclear. We evaluate a single-session classroom intervention in AAEC 1006 (Principles of Macroeconomics) at Virginia Tech, Spring 2026, using a three-arm individually randomized controlled trial at the student level (N = 141; retrieval practice without AI, retrieval practice with Microsoft Copilot, placebo control). The primary outcome is the score on an extra-credit graded exercise administered after the treatment under identical conditions across all arms (no AI, Respondus LockDown Browser). Secondary outcomes include time on task on the graded exercise, self-reported AI use and study habits, and midterm scores on topic-relevant and overall items as measures of persistence.
External Link(s)

Registration Citation

Citation
Davis, George and Deepak Kumar. 2026. "Investigating the Retrieval Effect with and without AI Support: A Randomized Controlled Trial among College Students." AEA RCT Registry. May 06. https://doi.org/10.1257/rct.18528-1.0
Experimental Details

Interventions

Intervention(s)
The study evaluates whether access to a generative-AI assistant during an in-class retrieval-practice exercise modifies the cognitive benefits of retrieval practice on a subsequent graded test taken without AI. The intervention is embedded within a single large-enrollment introductory economics course at Virginia Tech: AAEC 1006 (Principles of Macroeconomics), Spring 2026. The material covered in the practice and extra-credit graded exercises is delivered by the class instructor in the regular class period preceding the experiment.

The intervention is delivered in a single 50-minute class session structured as follows: instructions (~5 min), practice exercise under the assigned treatment conditions (~20 min), instructor lecture on an unrelated topic (~5 min), and extra-credit graded exercise under identical conditions for all treatment groups (~15 min). All responses are submitted through Canvas.

The treatment arms are:
1) Group A (Retrieval, No AI): Students answer a practice exercise on a topic delivered by the class instructor in the regular class period preceding the experiment. Students submit their responses on Canvas using the Respondus LockDown Browser, with no access to external resources or AI tools.

2) Group B (Retrieval, With AI): Students complete the same practice exercise as Group A on Canvas without the LockDown Browser, with permission to use generative-AI tools. The only AI tool allowed in this experiment is Microsoft Copilot signed in with Virginia Tech credentials; this service is covered by VT’s enterprise data protection agreement (FERPA-aligned, with prompts not used for public model training). Students in this arm receive a written quick-start handout explaining how to access the VT-protected version of Copilot.

3) Group C (Placebo): Students read a passage on an unrelated topic and answer a different question on Canvas using the LockDown Browser.

Post-intervention, all three arms are tested on the same extra-credit graded exercise under identical conditions: no AI, using the Respondus LockDown Browser. The score on this graded exercise is the primary outcome.

Intervention Start Time: 2026-05-01 (09:00 AM)
Intervention End Time: 2026-05-01 (09:50 AM)
Intervention Start Date
2026-05-01
Intervention End Date
2026-05-02

Primary Outcomes

Primary Outcomes (end points)
Test score: Standardized score on the extra-credit graded exercise, administered after the practice exercise (post-intervention).
Primary Outcomes (explanation)
Test scores are standardized by subtracting the placebo control arm (Group C) mean and dividing by its standard deviation, so all treatment effects are reported in control-group standard deviations.
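A minimal sketch of this standardization (the function name and the numeric scores are illustrative, not taken from the study data):

```python
import numpy as np

def standardize(scores, control_scores):
    """Express scores in placebo-control (Group C) standard deviations."""
    mu = control_scores.mean()
    sd = control_scores.std(ddof=1)  # sample SD of the control arm
    return (scores - mu) / sd

# Hypothetical Group C scores; a score of 90 is one control SD above the mean
control = np.array([70.0, 80.0, 90.0])
z = standardize(np.array([90.0]), control)
```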

Secondary Outcomes

Secondary Outcomes (end points)
(i) Time on task on the graded exercise: elapsed time drawn from Canvas timestamps, standardized using the placebo control arm (Group C) mean and standard deviation.
(ii) Self-reported AI use and study habits: from the pre-intervention survey.
(iii) Midterm score on topic-relevant items: standardized score on the subset of midterm questions covering the experimental topic.
(iv) Overall midterm score: standardized total midterm score.
Secondary Outcomes (explanation)
Secondary outcomes (i) and (ii) are intended to characterize the mechanisms through which intervention effects operate. Outcomes (iii) and (iv) measure persistence beyond the immediate effect.

Experimental Design

Experimental Design
The evaluation is a three-arm individually randomized controlled trial conducted in a single large undergraduate economics course, AAEC 1006 (Principles of Macroeconomics), Spring 2026 at Virginia Tech. Random assignment is at the student level. The arms are:

Group A — Retrieval, No AI (LockDown Browser, no external resources)
Group B — Retrieval, With AI (no LockDown Browser, Microsoft Copilot permitted)
Group C — Placebo Control (LockDown Browser, reads passage and answers a different question)

Participation eligibility is determined by enrollment in AAEC 1006 in Spring 2026 and completion of the Canvas consent quiz. Out of 154 enrolled students, 141 gave affirmative consent. All 141 were randomly assigned across the three arms, with 47 students in each arm. Participation is incentivized by extra-credit points (not part of normal grading) tied to the post-treatment graded exercise.

The comparison A vs. C identifies the effect of retrieval practice; B vs. C identifies the effect of AI-supported practice; and B vs. A isolates the marginal effect of AI access conditional on engaging in the practice exercise.
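These three contrasts can be read off a single arm-dummy regression with stratum indicators. The sketch below uses simulated data; the +0.4 SD effect for Group A, the random strata, and the use of plain OLS are illustrative assumptions, not the registered analysis plan:

```python
import numpy as np

rng = np.random.default_rng(20260425)
arm = np.repeat(["A", "B", "C"], 47)        # 141 students, 47 per arm
stratum = rng.integers(0, 4, 141)           # four ability strata (simulated)

A = (arm == "A").astype(float)              # dummy: retrieval, no AI
B = (arm == "B").astype(float)              # dummy: retrieval, with AI
S = np.eye(4)[stratum][:, 1:]               # stratum fixed effects (base = 0)
X = np.column_stack([np.ones(141), A, B, S])

# Hypothetical data-generating process: +0.4 SD effect of Group A vs. placebo
y = 0.4 * A + 0.1 * stratum + rng.normal(0, 1, 141)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[1] estimates A vs. C, beta[2] estimates B vs. C,
# and beta[1] - beta[2] is the marginal effect of AI access (A vs. B)
```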

Stratification and randomization: Students were sorted in ascending order by Test 1 score (the pre-randomization exam) and divided into four ability strata of sizes 36, 36, 36, and 33. These sizes were chosen so that each stratum is divisible by three and the four strata sum to the full sample of 141, guaranteeing an exact 47/47/47 split across arms. Within each stratum, treatment was assigned by block randomization: an arm list with each label (A, B, C) repeated in equal proportion (12 times in the first three strata and 11 times in the fourth) was randomly shuffled, and students were paired with the shuffled list element-wise. Pre-treatment Test 1 scores did not differ significantly across arms (one-way ANOVA F = 0.18, p = 0.833; all pairwise Welch t-tests p > 0.5).
Experimental Design Details
Not available
Randomization Method
Computer randomization in Python, stratified on Test 1 score, with block-of-three allocation within each stratum. The pseudo-random number generator was seeded with the execution date (20260425) and locked prior to assignment for reproducibility.
Randomization Unit
Individual student.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
Not applicable (individually randomized).
Sample size: planned number of observations
A total of 154 students were approached to consent to participate in the experiment. Of these, 141 (91.6%) provided affirmative consent and completed the baseline survey prior to the intervention.
Sample size (or number of clusters) by treatment arms
Group A (No AI): 47 students
Group B (With AI, Microsoft Copilot): 47 students
Group C (Placebo Control): 47 students
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
0.39 SD or 5.0 percentage points
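The registered 0.39 SD figure is consistent with a standard two-sided, two-sample power calculation (5% significance, 80% power, 47 students per arm) once baseline covariate adjustment is assumed; the R-squared of 0.55 below is a hypothetical value chosen for illustration and does not come from the registration:

```python
from scipy.stats import norm

n = 47                       # students per arm
z_alpha = norm.ppf(0.975)    # critical value, two-sided 5% test
z_beta = norm.ppf(0.80)      # 80% power

mde_raw = (z_alpha + z_beta) * (2 / n) ** 0.5   # unadjusted MDE, ~0.58 SD
r2 = 0.55                    # hypothetical R-squared from baseline Test 1
mde_adj = mde_raw * (1 - r2) ** 0.5             # covariate-adjusted, ~0.39 SD
```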
Supporting Documents and Materials

There is information in this trial unavailable to the public.
IRB

Institutional Review Boards (IRBs)

IRB Name
Virginia Tech Institutional Review Board
IRB Approval Date
2026-04-14
IRB Approval Number
26-372
Analysis Plan

There is information in this trial unavailable to the public.