Experimental Design Details
I will recruit participants from Prolific to complete a Qualtrics survey. Participants will grade a sequence of homework submissions responding to the same prompt, “Should we tax companies that use artificial intelligence in place of human workers?” The Qualtrics survey mimics grading on platforms like Canvas in either a blind or a nonblind environment. Before the grading portion of the survey, participants receive contextual information on the homework question, and they can review fictitious class notes and a grading rubric at any time during the study. Participants will be told in advance that everything is fictitious; I use AI to generate all materials.
Participants in the treatment (nonblind) group will see a randomly generated fictitious identity attached to each submission; identities are randomly assigned to each submission each participant sees. Each identity includes a name and a photo reflecting the fictitious student's gender and race, mimicking the information graders see on grading platforms like Canvas. Participants in the blind group will not observe any fictitious student information.
To identify the potential impact of blind grading on gender and racial grade gaps in education, I include four racial and ethnic groups and two genders. The fictitious students are White, Black, Hispanic, or Asian, and I will pool the Black, Hispanic, and Asian identities into a non-white group. White and non-white identities, and female and male identities, will always be equally represented.
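To make the randomization concrete, the following is a minimal Python sketch of one way the per-participant identity draw could satisfy the balance constraint, assuming the constraint applies within each participant's set of eight submissions and that race and gender are shuffled independently. The function and variable names are illustrative, not the actual Qualtrics implementation.

```python
import random

NONWHITE_RACES = ["Black", "Hispanic", "Asian"]

def draw_identity_cells(n_submissions=8, rng=random):
    """Draw (race, gender) cells with equal white/non-white and female/male counts."""
    half = n_submissions // 2
    races = ["White"] * half + [rng.choice(NONWHITE_RACES) for _ in range(half)]
    genders = ["Female"] * half + ["Male"] * half
    rng.shuffle(races)    # randomize which submission gets which race cell
    rng.shuffle(genders)  # gender varies independently of race
    return list(zip(races, genders))
```

Each (race, gender) cell would then be matched to a name and photo consistent with that cell before being attached to a submission.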
Each participant will grade eight randomly ordered submissions drawn from a larger pool of twenty submissions. Each submission is graded out of five points, and participants can allocate points in half-point increments to reflect partial credit, a common practice in real classrooms. Once a submission is graded, participants cannot go back to review or change their previous assessments; this restriction keeps the identification clean.
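As a sketch of the submission draw and the scoring grid, assuming a uniform draw without replacement (whether the actual design stratifies the draw is not specified here):

```python
import random

POOL_SIZE = 20  # total submissions written for the study
N_GRADED = 8    # submissions each participant grades
VALID_SCORES = [x / 2 for x in range(11)]  # 0.0, 0.5, ..., 5.0 in half points

def draw_submission_order(rng=random):
    """Pick 8 of the 20 submissions, already in random presentation order."""
    return rng.sample(range(POOL_SIZE), k=N_GRADED)

def is_valid_score(score):
    """Scores must fall on the half-point grid out of five."""
    return score in VALID_SCORES
```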
Participants can consult the grading rubric or the class notes while they grade. Every submission page includes two gray buttons labeled “View Answer Key” and “View Lecture Notes.” If participants choose to reference these materials, they can click the buttons to reveal images of the grading rubric and the class notes within the page.
Once all eight submissions have been graded, participants will answer self-rating questions on their agreement with four statements: “Grading is exhausting,” “The grading rubric helped with assessing the homework,” “The class lecture notes helped with assessing the homework,” and “Deciding how many points to allocate became easier over the duration of the survey.” Afterward, they provide their demographic information and finish the study.
Participants will receive both a flat fee for completing this study and a potential bonus for "accurately" grading each submission. Participants will know there exists a "correct" value associated with each submission, but they will not know this value.
The "correct" values come from a survey I conducted where faculty and graduate students from the economics department at the University of Oregon blindly graded the homework submissions that will be used in this experiment. The average scores from this pre-experiment survey will act as the correct value and benchmark the bonus payments for each submission.
In this study, participants will be told that a group of teachers previously graded each submission and that each submission is associated with the average score from the teachers' grades. In addition, participants will be explicitly told they can earn a bonus if they grade each submission within half a point of this pre-determined score.
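The bonus rule reduces to a simple comparison. Below is a sketch, assuming the benchmark is the unweighted mean of the pre-experiment grades and the half-point window is inclusive at its endpoints; neither detail is spelled out above, so both are assumptions.

```python
from statistics import mean

def benchmark_score(pre_experiment_grades):
    """Benchmark = average of the faculty and graduate-student grades."""
    return mean(pre_experiment_grades)

def earns_bonus(participant_grade, benchmark, tolerance=0.5):
    """The bonus is paid when a grade lands within half a point of the benchmark."""
    return abs(participant_grade - benchmark) <= tolerance

# Example: if three pre-experiment graders gave 4.0, 3.5, and 4.5, the
# benchmark is 4.0, so any participant grade from 3.5 to 4.5 earns the bonus.
```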
The bonus amount will be shown in the instructions before the grading portion of the study begins. The bonus will range between $0.05 and $0.45 in intervals of ten cents. Participants will receive a flat fee of $1.75 for completing the survey, plus any bonus earned. Because I am using Prolific, the expected completion time and the flat fee are set prior to the launch of the experiment.
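For a back-of-the-envelope check of the payout bounds, assuming a participant's assigned bonus rate applies to all eight of their submissions (the text implies a single rate per participant but does not state it outright):

```python
FLAT_FEE = 1.75                                # guaranteed completion fee
BONUS_LEVELS = [0.05, 0.15, 0.25, 0.35, 0.45]  # possible per-submission rates
N_SUBMISSIONS = 8

min_payout = FLAT_FEE                                      # no accurate grades
max_payout = FLAT_FEE + N_SUBMISSIONS * max(BONUS_LEVELS)  # all eight accurate
print(f"${min_payout:.2f} to ${max_payout:.2f}")           # $1.75 to $5.35
```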