Experimental Design
This is a field experiment conducted in a large principles of economics course at a comprehensive university. Students complete two Short Writing Assignments (SWAs) of two types. In the first (SWA1), students grade answers submitted by students in a previous semester; in the second (SWA2), students answer a set of questions themselves.
SWA2: Students are given a prompt with several questions and must answer them in essay form. Submissions are typed, with no required minimum or maximum length, though students are advised that they do not need more than 250 words. These SWAs are evaluated blindly by the TAs for accuracy and writing quality.
SWA1: Students are given 20 SWAs that were submitted by students from a previous semester. They grade these 20 SWAs on Qualtrics using the same rubric and training that were actually used to grade these old SWAs. The rubric has 10 components, and each is worth from 0 to 10 points for a total of 100 possible points. Students grade each of the 20 SWAs they received using this rubric.
TA-assigned grades from previous semester: For each SWA we give students to grade, we already have the total score from the previous semester's grading. These assignments were randomly assigned to a TA grader; graders were responsible only for grading and did not interact with students in any way. Grading was double-blind such that neither student nor grader was aware of the identity of the other. TA graders were provided the same standard rubric of ten elements, each valued at ten points. To obtain more precise measures of SWA grades from the previous semester, each SWA was regraded blindly by two additional randomly assigned graders after the end of that semester. That is, there are 3 blindly graded scores by 3 different TAs, and the average of these 3 grades serves as a benchmark. We call this the "correct" score, and students were told to match the correct score without knowing how it was calculated. They were told only that the correct scores were obtained blindly by several TAs and the instructor in the previous semester.
In SWA1, students are incentivized to do their best in grading these 20 SWAs by matching the "correct" scores. They are instructed that their grade will be calculated as follows:
We will compare the total score a student gave each of the 20 SWAs they graded to the correct total score for that SWA. If their total is within ±3, they earn 5 points for that SWA; within ±5, 4 points; within ±7, 3 points; within ±9, 2 points; and within ±11, 1 point. In addition, they earn 1 bonus point if their score exactly matches the correct score. We apply this procedure to all 20 of their graded SWAs, and their total number of points is their score on SWA1.
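The scoring bands above can be expressed compactly in code. The sketch below is illustrative only (function names and signatures are our own); it implements the point schedule and exact-match bonus as described in the instructions.

```python
def swa_points(student_total, correct_total):
    """Points earned for one graded SWA, based on closeness to the correct score.

    Bands follow the assignment instructions: +/-3 -> 5, +/-5 -> 4, +/-7 -> 3,
    +/-9 -> 2, +/-11 -> 1, plus a 1-point bonus for an exact match.
    """
    diff = abs(student_total - correct_total)
    if diff <= 3:
        points = 5
    elif diff <= 5:
        points = 4
    elif diff <= 7:
        points = 3
    elif diff <= 9:
        points = 2
    elif diff <= 11:
        points = 1
    else:
        points = 0
    if diff == 0:
        points += 1  # exact-match bonus
    return points


def swa1_score(student_totals, correct_totals):
    # SWA1 grade: sum of points over all 20 graded SWAs.
    return sum(swa_points(s, c) for s, c in zip(student_totals, correct_totals))
```

Under this schedule, the maximum possible SWA1 score is 120 (6 points per SWA across 20 SWAs).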
On the day of assignment opening, students received instructions both on the Canvas site and in lecture on how to complete the assignment. These instructions included how to access the Qualtrics survey, an overview of the prompt and rubric, four sample submissions with writing quality scores, an in-lecture demonstration of completing the survey, and a description of how they would be evaluated. Students were invited to ask questions both in lecture and through email. The in-class demonstration and sample submissions included submissions that were excluded from the randomization pool.
The deadlines for the SWAs are as follows, and everyone is granted an automatic, no-penalty one-day extension. SWA1 opens at 11:00 am on June 3, 2024, and the deadline is June 6 at 11:59 pm; with the automatic extension, the final deadline is 11:59 pm on June 7.
The process of assigning 20 SWAs:
The randomization pool of 600 previously graded SWAs was created from a pool of 845 real submissions from Spring 2024. Only properly submitted anonymous assignments were retained; regraded assignments and those with outlier scores on certain rubric elements were also excluded. Of the remaining 770 submissions, 600 were selected at random to serve as the randomization pool. Each assignment was randomly assigned to one of twenty "banks," and each submission in a bank appears in two versions: one with a randomly chosen male-sounding first name (drawn with replacement) and one with a randomly chosen female-sounding first name (drawn with replacement). Last initials were also randomly assigned at the file level (with replacement).
Qualtrics chooses a "bank" at random (without replacement) to display and then chooses a file at random within that bank to present to the student. It is therefore not possible for a student to observe the same file twice. However, a student could potentially observe the same first name with different last initials for different files.
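The bank-then-file draw can be sketched as a small simulation. This is a hedged illustration of the logic (function names, seeds, and the balanced partition are our assumptions, not the actual Qualtrics implementation): because banks partition the pool and each bank is drawn at most once per student, no student can see the same file twice.

```python
import random


def assign_banks(file_ids, n_banks=20, seed=0):
    # Randomly partition the pool of graded submissions into n_banks banks.
    # (Illustrative: the actual bank assignment procedure may differ.)
    rng = random.Random(seed)
    shuffled = list(file_ids)
    rng.shuffle(shuffled)
    return [shuffled[i::n_banks] for i in range(n_banks)]


def draw_for_student(banks, seed):
    # Draw banks without replacement, then one file at random within each bank.
    # Banks are disjoint, so a student never observes the same file twice.
    rng = random.Random(seed)
    bank_order = rng.sample(range(len(banks)), len(banks))
    return [rng.choice(banks[b]) for b in bank_order]
```

In this sketch a student receives exactly one file from each of the 20 banks, matching the 20 SWAs assigned per student.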
In addition to student responses, Qualtrics also collects metadata including browser name and device type, time spent on each page, time stamps for each click on the rubric and file, and the order in which the questions appeared and were answered.
Name selection: We compiled a list of 300 first names for random assignment to the short writing assignments. We started from the Spring 2024 course roster and used R's gender package to predict gender by first name, keeping only names that were >98% or <2% probability female and that appeared at least 1,000 times in the package's database. We also manually removed names that were not obviously associated with one gender.
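The filtering rule can be sketched as follows. The actual procedure used R's gender package; this Python version is a hypothetical illustration of the same criterion, with a made-up data structure (name mapped to estimated probability female and database count).

```python
def filter_names(name_stats, p_hi=0.98, p_lo=0.02, min_count=1000):
    """Keep names that are strongly gendered and sufficiently common.

    name_stats: dict mapping first name -> (prob_female, count), a
    hypothetical stand-in for the output of R's gender package.
    """
    male, female = [], []
    for name, (p_female, count) in name_stats.items():
        if count < min_count:
            continue  # too rare in the database
        if p_female > p_hi:
            female.append(name)
        elif p_female < p_lo:
            male.append(name)
        # Names between the thresholds are dropped as ambiguous.
    return male, female
```

Manual removal of residual ambiguous names would follow this automated step, as described above.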
Restricting the analysis: If students do not submit SWA1 (grading of 20 SWAs from the previous semester) by the end of the extension, they receive a score of zero. Late or incomplete submissions will not be included in our analysis.