AI Engagement, Belief Formation, and Human Capital Investment: Theory and Experimental Evidence

Last registered on April 27, 2026

Trial Information

General Information

Title
AI Engagement, Belief Formation, and Human Capital Investment: Theory and Experimental Evidence
RCT ID
AEARCTR-0018436
Initial registration date
April 21, 2026

First published
April 27, 2026, 11:04 AM EDT

Locations

Not available to the public.

Primary Investigator

Affiliation
VNU University of Economics and Business

Other Primary Investigator(s)

PI Affiliation
VNU University of Economics and Business
PI Affiliation
University of Paris Nanterre

Additional Trial Information

Status
In development
Start date
2026-06-01
End date
2027-01-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
This study examines whether a structured AI-assisted advising session changes students' beliefs about their own academic ability and alters their course enrollment decisions. Students approaching a real course registration decision are randomly assigned to one of three arms: a control group (no session), a productivity arm (AI practice session without feedback), and a feedback arm (AI session with a personalized readiness assessment). The primary outcome is enrollment in a harder course track. Secondary outcomes include belief updating (posterior minus prior pass probability) and end-of-semester grades. The design tests a theoretical model in which AI functions as an endogenous information production technology: students who engage more intensively receive more precise self-assessments, but if the AI system carries an optimistic bias, greater engagement amplifies belief distortion. The experiment is powered to detect a 12-percentage-point shift in hard-course enrollment with 80% power (N ≈ 360).
External Link(s)

Registration Citation

Citation
Nguyen, Tuan, Cuong Pham Thi Kim and Nguyen To The. 2026. "AI Engagement, Belief Formation, and Human Capital Investment: Theory and Experimental Evidence." AEA RCT Registry. April 27. https://doi.org/10.1257/rct.18436-1.0
Sponsors & Partners

Not available to the public.
Experimental Details

Interventions

Intervention(s)
This experiment studies whether a structured AI-assisted advising session, conducted before course registration opens, changes how students assess their own academic readiness and alters the courses they ultimately choose to enroll in. The intervention is built around a single 30-minute session with an AI tool, administered in the week before the registration window opens for the following semester. All students complete a short baseline survey before the session and a post-session survey immediately after. The primary outcome, course enrollment, is obtained from administrative registrar records after the add/drop period closes.
Students are randomly assigned to one of three arms. The control arm receives no structured session. These students proceed through the normal registration process without any intervention from the research team. They complete the baseline and post-session surveys at the same scheduled times as students in the active arms; before answering the post-session belief question, they are told that a session took place for other students and are asked to record their current thinking about their course planning. This design ensures that any difference between the control arm and the active arms reflects the session itself rather than the act of completing a survey.
The productivity arm attends the 30-minute session and uses an AI tool to work through a set of practice problems drawn from the beginning of the harder course curriculum. The tool is configured to assist with problem-solving, explain underlying concepts, and check the student's reasoning step by step. It is explicitly not configured to comment on the student's ability, predict her likelihood of success in the harder course, or compare her performance to any benchmark or reference group. The session therefore activates a productivity channel, in which AI assistance raises the student's immediate task performance, while holding the belief channel at zero. Students in this arm leave the session having practiced relevant material with AI support but without having received any personalized assessment of their academic readiness.
The feedback arm attends an identical 30-minute session using the same AI tool and the same set of practice problems. The session format, the problem difficulty distribution, and the interface are indistinguishable from those in the productivity arm up to the final few minutes of the session. The sole difference is what happens at the end. After completing the problem set, the student receives a personalized readiness assessment generated by the AI on the basis of her observable behavior during the session. The assessment identifies which categories of problem the student handled confidently and which she struggled with, and it provides a probabilistic estimate of her likely success in the harder course, stated numerically as a percentage chance of earning a passing grade. This assessment is the empirical realization of the ability signal in the theoretical model. Crucially, its informativeness is endogenous to how the student engaged during the session: a student who attempted harder problems, asked more substantive questions, and iterated on targeted feedback receives a more precise and more reliable assessment than one who attempted only easy questions or interacted with the tool in a surface-level way. This feature distinguishes the feedback treatment from a conventional information intervention that delivers the same message to every student regardless of her behavior during the session.
Comparing the feedback arm to the control arm identifies the total effect of the structured session on beliefs and enrollment, combining both the productivity and the belief-updating channels. Comparing the feedback arm to the productivity arm isolates the belief channel alone, because the session format, AI tool, problem set, and duration are held constant across these two arms and the only difference is whether the student receives the personalized readiness assessment at the end. This arm comparison is the key identification strategy for the paper's central theoretical claim that AI functions as an endogenous self-knowledge technology rather than a uniform information delivery device.
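As a concrete illustration, the two arm comparisons can be estimated jointly in a single linear probability model. The following is a minimal Python sketch; the data file, column names, and covariate set are hypothetical placeholders, and the registered pre-analysis plan governs the exact specification.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("trial_data.csv")  # hypothetical analysis file

# enroll_hard: 1 if enrolled in the harder track; feedback and
# productivity: 0/1 assignment dummies (control omitted); strata:
# GPA-tercile-by-section randomization cell.
model = smf.ols("enroll_hard ~ feedback + productivity + C(strata)", data=df)
res = model.fit(cov_type="HC2")  # heteroskedasticity-robust standard errors

# Feedback vs control gives the total session effect; the contrast
# below isolates the belief channel (feedback vs productivity).
print(res.t_test("feedback - productivity = 0"))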
One important design feature must be stated clearly. Students in the control arm and the productivity arm are not prevented from using AI tools independently during the period between their session and their registration decision. In a contemporary university setting this restriction would be practically unenforceable, and attempting to impose it would introduce ethical complications and differential attrition that would compromise the internal validity of the study. The comparison being made is therefore not between AI access and no AI access, but between a structured, institutionally designed AI advising session and the unstructured status quo in which students make their registration decision using whatever resources they ordinarily draw on. This framing is both honest and policy-relevant, because the practical question facing university administrators is not whether students should use AI at all but whether a structured institutional session adds measurable value over and above unguided use.
The study recruits undergraduate students who are approaching a genuine registration decision in which a harder course is available as a clear alternative to a standard option. The most natural settings are introductory quantitative sequences, such as introductory statistics, introductory microeconomics, or a foundational mathematics course, where the harder follow-on course is readily distinguishable from the easier one, where students face genuine uncertainty about whether they are ready for the more demanding option, and where AI-assisted practice on relevant problem types is feasible within a 30-minute session. Students are recruited during the first week of the current semester, so that the treatment falls naturally in the period when course selection for the following semester is actively being considered and before any registration decision has been made. Recruitment, random assignment, and all session activities are completed before the registration window opens, ensuring that the intervention precedes rather than follows the outcome it is designed to influence.
The experiment involves no deception. Students in all three arms are informed at recruitment that the study examines how different types of academic support sessions affect course planning decisions, and that their eventual enrollment and grade records will be obtained from the registrar for research purposes. Students in the feedback arm are told before the session begins that they will receive a personalized summary of their performance at the end. Informed consent is obtained from all participants prior to the baseline survey, and the study has received a positive ethics opinion from the Research Ethics Committee of the University of Economics and Business, Vietnam National University, Hanoi (Decision No. 2026-REC-UEB, April 2026).
Intervention Start Date
2026-06-01
Intervention End Date
2027-01-31

Primary Outcomes

Primary Outcomes (end points)
Hard-course enrollment indicator, posterior belief update, and end-of-semester grade in enrolled course.
Primary Outcomes (explanation)
Hard-course enrollment is a binary variable equal to 1 if the student enrolls in the harder course track and 0 if she enrolls in the standard course, measured from registrar records after the add/drop period closes for the following semester. This is the paper's primary behavioral outcome because it is the real-stakes decision that the entire theoretical model is built around: a student who correctly learns that she is more capable than she believed should shift toward the harder course, while a student who is misled by an optimistically biased assessment may shift toward the harder course for the wrong reason and fail to succeed once enrolled.
The posterior belief update is constructed as the arithmetic difference between the student's post-session and baseline responses to an identically worded question asking for the probability, stated as an integer from 0 to 100, that she would pass the harder course with a grade of B or higher. The identical wording across both time points is essential to ensure that the difference reflects a genuine revision of the student's beliefs about her own ability rather than a framing or context effect introduced by differences in question phrasing. This outcome directly operationalizes the belief-updating mechanism that is the paper's central theoretical contribution and is the key mediating variable in the instrumental variable identification strategy.
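For concreteness, the construction reduces to a single arithmetic difference. A minimal Python sketch follows; the file and column names are hypothetical.

import pandas as pd

surveys = pd.read_csv("survey_data.csv")  # hypothetical merged survey file
# Both elicitations are integers in [0, 100] from identically worded
# questions; positive values indicate an upward revision.
surveys["belief_update"] = surveys["post_pass_prob"] - surveys["baseline_pass_prob"]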
The end-of-semester grade is the letter grade earned by the student in whichever course she ultimately enrolled in, converted to a standard 4.0 GPA scale and obtained from registrar administrative records at the end of the following semester. Although it is collected only after the primary enrollment outcome and serves in that sense as a downstream welfare outcome, it is treated as a primary object for evaluating the direction of welfare effects because it is the only outcome that allows the research team to distinguish between the two qualitatively different regimes the model predicts. The sign and magnitude of the feedback arm coefficient in the grade outcome regression, relative to the enrollment effect, is the empirical analogue of the overinvestment condition derived in Proposition 2.5 of the theoretical model.

Secondary Outcomes

Secondary Outcomes (end points)
Session engagement intensity, readiness assessment accuracy for feedback-arm students, and heterogeneity of belief revision by prior uncertainty level.
Secondary Outcomes (explanation)
Session engagement intensity is a composite measure of two directly observable quantities recorded during the 30-minute session. The first component is the proportion of practice problems the student attempted out of the total available, which captures breadth of engagement. The second component is the count of substantive exchanges with the AI tool, defined as exchanges in which the student posed an original question, submitted a problem attempt for evaluation, or requested targeted clarification on a specific reasoning step, as distinguished from simple navigation requests or generic prompts. These two components are recorded via platform logs if the tool is delivered through a monitored interface, or via a structured observation sheet completed by the session supervisor, and they are combined into a single engagement intensity index for use as the endogenous variable in the instrumental variable specifications. Engagement intensity is not a pure outcome in the conventional sense because it is a mediating variable on the causal pathway from treatment assignment to beliefs and enrollment, but it is pre-registered as a secondary outcome because its distribution across arms is itself a test of the model's first-stage prediction.
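A minimal sketch of how the index might be assembled from the two logged components, assuming hypothetical column names; the pre-analysis plan specifies the exact aggregation rule.

import pandas as pd

logs = pd.read_csv("session_logs.csv")  # hypothetical platform log extract
logs["attempt_rate"] = logs["problems_attempted"] / logs["problems_available"]

# z-score each component, then average into a single index.
for col in ["attempt_rate", "substantive_exchanges"]:
    logs[col + "_z"] = (logs[col] - logs[col].mean()) / logs[col].std()

logs["engagement_index"] = logs[["attempt_rate_z", "substantive_exchanges_z"]].mean(axis=1)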
Readiness assessment accuracy is constructed ex post for feedback-arm students only, after semester grade data become available. For each student, the research team computes the difference between the AI-generated success probability delivered at the end of her session and the realized pass rate among observationally similar students, matched on GPA tercile, session problem-attempt rate, and the prior uncertainty proxy. This calibration error variable is the empirical realization of the systematic bias parameter in the theoretical model and is used as the moderating variable in the bias amplification regression. A larger positive value of the calibration error indicates that the AI system was more optimistic about the student's prospects than her eventual outcome warranted, and the interaction of this variable with instrumented engagement intensity is the key test of Proposition 2.5.
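A sketch of the calibration-error construction under these matching dimensions follows (Python); the column names and the tercile binning of the two continuous matching variables are illustrative assumptions.

import pandas as pd

fb = pd.read_csv("feedback_arm.csv")  # hypothetical feedback-arm file
fb["attempt_bin"] = pd.qcut(fb["attempt_rate"], 3, labels=False)
fb["uncert_bin"] = pd.qcut(fb["prior_uncertainty"], 3, labels=False)

cells = ["gpa_tercile", "attempt_bin", "uncert_bin"]
# Realized pass rate (B or higher in the hard course) among
# observationally similar students in the same matching cell.
fb["cell_pass_rate"] = fb.groupby(cells)["passed_hard_course"].transform("mean")

# Positive values indicate AI optimism relative to matched outcomes.
fb["calibration_error"] = fb["ai_success_prob"] / 100 - fb["cell_pass_rate"]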
Heterogeneity of belief revision by prior uncertainty tests whether the magnitude of the posterior belief update in the feedback arm is larger for students who entered the session with greater uncertainty about their own ability. Prior uncertainty is operationalized as the within-student standard deviation of baseline pass-probability beliefs elicited across four reference courses of varying difficulty, following the measurement strategy of Wiswall and Zafar (2015). This heterogeneity prediction follows from Proposition 2.2 of the model, which establishes that students with more uncertain priors engage more intensively with the AI tool, and from Proposition 2.5, which establishes that more intensive engagement with a biased system produces proportionally larger distortions.
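The proxy itself is a simple within-student dispersion measure. A minimal sketch, where the four belief columns are hypothetical names for the four reference-course elicitations:

import pandas as pd

base = pd.read_csv("baseline_survey.csv")  # hypothetical baseline file
belief_cols = ["belief_course_1", "belief_course_2",
               "belief_course_3", "belief_course_4"]
# Within-student standard deviation of the four 0-100 elicitations.
base["prior_uncertainty"] = base[belief_cols].std(axis=1, ddof=1)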

Experimental Design

Experimental Design
This is a three-arm individually randomized experiment in which undergraduate students approaching a real course registration decision are randomly assigned to a control condition or one of two active treatment arms before the registration window for the following semester opens. The experiment is conducted within introductory quantitative course sequences where a harder follow-on course is clearly distinguishable from a standard option and where students face genuine uncertainty about their readiness for the more demanding track.
Randomization is stratified by GPA tercile within course section using a pre-registered random seed, which ensures balance on the strongest available predictor of both session engagement and course choice and reduces residual variance in the primary outcome regression. All students complete a five-minute baseline survey before the session begins, covering their self-assessed probability of passing the harder course, risk preferences measured through a five-item Holt and Laury (2002) task and the Dohmen et al. (2011) single-item scale, prior experience with AI tools, self-assessed proficiency in the relevant subject area, and grade-following behavior. All students complete a five-minute post-session survey immediately after the session, covering the same pass-probability question worded identically to the baseline version, a self-report of session engagement, and, for feedback-arm students only, two manipulation check questions about whether they recall the readiness estimate they received and how accurate they found it. Cumulative GPA and credits completed are obtained from administrative records without self-report.
The primary behavioral outcome is the course the student actually enrolls in, obtained from registrar records after the add/drop period closes for the following semester. The secondary welfare outcome is the grade the student earns in that course, obtained from registrar records at the end of the following semester. Neither of these outcomes is collected directly by the research team; both arrive from administrative sources and require no further contact with participants after the post-session survey.
The planned sample is 360 students across three equally sized arms of 120 students each. The power calculation targets the comparison between the feedback arm and control on the hard-course enrollment indicator, with a baseline enrollment rate of 35 percent, a minimum detectable effect of 12 percentage points, a two-sided significance level of 0.025 after Bonferroni correction for two arm comparisons, and 80 percent power. If single-cohort recruitment falls short of the target, the design pools across two consecutive cohorts while preserving stratified randomization within each cohort and including cohort fixed effects in all regression specifications. The full pre-analysis plan, specifying all primary hypotheses, estimating equations, covariate lists, and the falsification restriction that the first-stage coefficient on the feedback arm indicator must strictly exceed that on the productivity arm indicator, is registered before any data collection begins.
Experimental Design Details
Not available
Randomization Method
Randomization is conducted by computer in the research office before recruitment begins, using a pre-registered random seed documented in the pre-analysis plan. Stratification proceeds by first dividing students within each course section into GPA terciles based on cumulative GPA obtained from administrative records, and then independently randomizing within each tercile-by-section cell using a fixed random seed. This procedure ensures that the three arms are balanced on GPA, which is the strongest available predictor of both session engagement and hard-course enrollment, and that the stratification structure is fully transparent and replicable from the pre-registered seed. Assignment ratios within each cell are fixed at 1:1:1 across the three arms. The assignment list is generated and sealed before any student contact occurs and is not accessible to session supervisors during the recruitment or session period.
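A minimal Python sketch of this procedure; the seed value, file, and column names are illustrative placeholders rather than the registered values.

import numpy as np
import pandas as pd

roster = pd.read_csv("roster.csv")  # hypothetical registrar extract
rng = np.random.default_rng(20260421)  # placeholder for the pre-registered seed

# GPA terciles computed separately within each course section.
roster["gpa_tercile"] = roster.groupby("section")["gpa"].transform(
    lambda g: pd.qcut(g, 3, labels=False)
)

arms = np.array(["control", "productivity", "feedback"])

def assign(cell):
    # Repeat the 1:1:1 pattern to the cell size, then shuffle,
    # so arm counts within each cell differ by at most one.
    labels = np.tile(arms, len(cell) // 3 + 1)[: len(cell)]
    out = cell.copy()
    out["arm"] = rng.permutation(labels)
    return out

roster = roster.groupby(["section", "gpa_tercile"], group_keys=False).apply(assign)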
Randomization Unit
Individual student. The design is not clustered. All students within the same course section are eligible for all three arms, and section membership is accounted for by including section fixed effects in all regression specifications. There is no group-level randomization and no situation in which students within the same section are all assigned to the same arm. The absence of clustering reflects the practical reality that the 30-minute sessions are conducted individually or in small groups that do not interact, and that the treatment (whether or not the student receives the readiness assessment) is delivered at the individual level with no spillover pathway to other students in the same section.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
360 individual students. Because the unit of randomization is the individual and the design is not clustered, the number of clusters equals the number of observations.
Sample size: planned number of observations
360 students across three arms of 120 students each. If single-cohort recruitment falls short of 360 due to lower-than-expected enrollment in the target course sequences, the design pools across two consecutive cohorts up to a maximum of 720 students, with cohort fixed effects included in all specifications and the stratified randomization procedure applied independently within each cohort.
Sample size (or number of clusters) by treatment arms
120 students assigned to the control arm, 120 students assigned to the productivity arm, and 120 students assigned to the feedback arm, for a total of 360 students.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
The power calculation is anchored to the primary comparison between the feedback arm and the control arm on the hard-course enrollment indicator. The baseline hard-course enrollment rate in the target course sequences is assumed to be 35 percent, based on historical registration data from comparable introductory quantitative sequences. The minimum detectable effect is 12 percentage points, corresponding to an increase from 35 percent to 47 percent in the feedback arm, which implies a Cohen's h of approximately 0.25, in the small-to-medium range by conventional standards. The significance level is set at 0.025 two-sided after Bonferroni correction for the two primary arm comparisons, feedback versus control and productivity versus control, and power is set at 80 percent. These parameters require approximately 120 students per arm. Stratification by GPA tercile within section is expected to reduce residual variance in the enrollment outcome by approximately 10 to 15 percent relative to unstratified randomization, providing a modest efficiency gain that gives the design a small buffer against attrition or non-compliance without requiring a larger nominal sample.
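As a quick arithmetic check of the effect size cited above, Cohen's h for a shift from 35 to 47 percent can be computed as follows (Python):

import math

p0, p1 = 0.35, 0.47
h = 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p0))
print(round(h, 3))  # 0.245, i.e. approximately 0.25 as stated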
Supporting Documents and Materials

Not available to the public.
IRB

Institutional Review Boards (IRBs)

IRB Name
VNU University of Economics and Business
IRB Approval Date
2026-04-21
IRB Approval Number
2026-REC-UEB-05
Analysis Plan

Not available to the public.