Understanding teacher perceptions of AI grading recommendations

Last registered on December 12, 2024

Pre-Trial

Trial Information

General Information

Title
Understanding teacher perceptions of AI grading recommendations
RCT ID
AEARCTR-0015004
Initial registration date
December 10, 2024

First published
December 12, 2024, 11:59 AM EST

Locations

Location information in this trial is not available to the public.

Primary Investigator

Affiliation
Brookings Institution

Other Primary Investigator(s)

PI Affiliation
Monash University

Additional Trial Information

Status
In development
Start date
2024-12-16
End date
2025-12-16
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
The study involves an online survey of teachers. The questionnaire will present different hypothetical scenarios of student answers to exam questions. Half of the participants will be randomly assigned a 'human first grader' questionnaire, while the other half will receive an 'artificial intelligence grading system' questionnaire. Participants will be asked to re-grade the hypothetical student answers. The study aims to (1) measure differences in teachers' perceptions of grading recommendations when they are provided by a human first grader versus an artificial intelligence grading system, and (2) identify the mechanisms that drive teachers' perceptions of such recommendations.
External Link(s)

Registration Citation

Citation
Goulas, Sofoklis and Rigissa Megalokonomou. 2024. "Understanding teacher perceptions of AI grading recommendations." AEA RCT Registry. December 12. https://doi.org/10.1257/rct.15004-1.0
Experimental Details

Interventions

Intervention(s)
The study involves an online survey of teachers. The questionnaire will present different hypothetical scenarios of student answers to exam questions. Half of the participants will be randomly assigned a 'human first grader' questionnaire, while the other half will receive an 'artificial intelligence grading system' questionnaire. Participants will be asked to re-grade the hypothetical student answers. The study aims to (1) measure differences in teachers' perceptions of grading recommendations when they are provided by a human first grader versus an artificial intelligence grading system, and (2) identify the mechanisms that drive teachers' perceptions of such recommendations.
Intervention Start Date
2024-12-16
Intervention End Date
2025-12-16

Primary Outcomes

Primary Outcomes (end points)
The primary outcome will be the grade teachers assign in each hypothetical scenario. The goal is to measure how close teachers' grades are to the first grader's recommendation when the first grader is a human colleague versus an artificial intelligence grading system.
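
The registration does not specify how closeness will be operationalized. Purely as an illustrative assumption, one natural measure is the absolute deviation between the teacher's grade and the first grader's recommendation, averaged within each arm and contrasted across arms; the Python sketch below uses hypothetical values and is not part of the registered analysis plan.

    import statistics

    # Hypothetical records: (arm, first-grader recommendation, teacher re-grade).
    # Both the deviation measure and these values are illustrative assumptions.
    responses = [
        ("human_first_grader", 70, 72),
        ("human_first_grader", 55, 55),
        ("ai_grading_system", 70, 64),
        ("ai_grading_system", 55, 60),
    ]

    def mean_abs_deviation(arm):
        # Average |teacher grade - recommendation| within one arm.
        return statistics.mean(abs(grade - rec) for a, rec, grade in responses if a == arm)

    # A larger deviation in the AI arm would indicate weaker adherence to AI recommendations.
    effect = mean_abs_deviation("ai_grading_system") - mean_abs_deviation("human_first_grader")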
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The study involves an online survey of teachers. The questionnaire will present different hypothetical scenarios of student answers to exam questions. Half of the participants will be randomly assigned a 'human first grader' questionnaire, while the other half will receive an 'artificial intelligence grading system' questionnaire. Participants will be asked to re-grade the hypothetical student answers. The study aims to (1) measure differences in teachers' perceptions of grading recommendations when they are provided by a human first grader versus an artificial intelligence grading system, and (2) identify the mechanisms that drive teachers' perceptions of such recommendations.
Experimental Design Details
Not available
Randomization Method
Randomization by computer
Randomization Unit
Participant
Was the treatment clustered?
No
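
The registration states only that assignment is randomized by computer at the participant level with no clustering. A minimal sketch of one common implementation, a reproducible 50/50 permutation over the planned 600 participants, follows; the seed, participant IDs, and arm labels are illustrative assumptions, not registered details.

    import random

    # Illustrative participant-level randomization into two equal arms.
    # IDs, seed, and arm labels are hypothetical; the registration states
    # only "randomization by computer" at the participant level.
    participant_ids = [f"P{i:03d}" for i in range(1, 601)]  # planned n = 600

    rng = random.Random(20241216)  # arbitrary fixed seed for reproducibility
    arms = ["human_first_grader"] * 300 + ["ai_grading_system"] * 300
    rng.shuffle(arms)  # random permutation yields exactly 300 per arm

    assignment = dict(zip(participant_ids, arms))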

Experiment Characteristics

Sample size: planned number of clusters
No clustering
Sample size: planned number of observations
600 participants
Sample size (or number of clusters) by treatment arms
Two primary treatment arms, 300 participants in each arm.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
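
This field is left blank in the registration. Purely for illustration, under conventional assumptions (two-sided alpha = 0.05, 80% power, equal allocation of 300 per arm, outcome standardized to unit variance), the standard two-sample formula implies an MDE of roughly 0.23 standard deviations; the sketch below encodes those assumptions and is not part of the registered design.

    from scipy.stats import norm

    # Illustrative MDE for a two-sample comparison of means, in standard
    # deviation units. Alpha, power, and unit variance are conventional
    # defaults assumed here, not values taken from the registration.
    alpha, power, n_per_arm = 0.05, 0.80, 300

    z_alpha = norm.ppf(1 - alpha / 2)  # ~1.96
    z_power = norm.ppf(power)          # ~0.84
    mde = (z_alpha + z_power) * (2 / n_per_arm) ** 0.5

    print(f"Implied MDE ~ {mde:.3f} SD")  # ~0.229 SD under these assumptions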
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number