Financial Incentives and Student Achievement: Evidence from Randomized Trials

Last registered on March 28, 2017

Pre-Trial

Trial Information

General Information

Title
Financial Incentives and Student Achievement: Evidence from Randomized Trials
RCT ID
AEARCTR-0001943
Initial registration date
March 27, 2017


First published
March 28, 2017, 3:12 PM EDT


Locations

Primary Investigator

Affiliation
Harvard University

Other Primary Investigator(s)

Additional Trial Information

Status
Completed
Start date
2007-07-16
End date
2011-11-02
Secondary IDs
Abstract
This paper describes a series of school-based field experiments in over 200 urban schools across three cities designed to better understand the impact of financial incentives on student achievement. In Dallas, students were paid to read books. In New York, students were rewarded for performance on interim assessments. In Chicago, students were paid for classroom grades. Researchers estimate that the impact of financial incentives on state test scores is statistically zero in each city. Due to a lack of power, however, researchers cannot rule out effect sizes that would have positive returns on investment. The only statistically significant effect is on English-speaking students in Dallas. The paper concludes with a speculative discussion of what might account for inter-city differences in estimated treatment effects.
External Link(s)

Registration Citation

Citation
Fryer, Roland. 2017. "Financial Incentives and Student Achievement: Evidence from Randomized Trials." AEA RCT Registry. March 28. https://doi.org/10.1257/rct.1943-1.0
Former Citation
Fryer, Roland. 2017. "Financial Incentives and Student Achievement: Evidence from Randomized Trials." AEA RCT Registry. March 28. https://www.socialscienceregistry.org/trials/1943/history/15531
Sponsors & Partners

There is information in this trial unavailable to the public.
Experimental Details

Interventions

Intervention(s)
In the United States, in an effort to increase achievement and narrow differences between racial groups, school districts have become laboratories for reforms. One potentially cost-effective strategy is providing short-term financial incentives for students to achieve or to exhibit behaviors correlated with achievement. Providing such incentives could have one of three possible effects. First, if students lack sufficient motivation, dramatically discount the future, or lack accurate information on the returns to schooling to exert optimal effort, providing incentives for achievement will yield increases in student performance. Second, if students lack the structural resources or knowledge to convert effort into measurable achievement, or if the production function has important complementarities outside their control (e.g., effective teachers, engaged parents, or social interactions), then incentives will have little impact. Third, financial rewards may undermine students' intrinsic motivation and lead to negative outcomes. Which of these effects – investment incentives, structural inequalities, or intrinsic motivation – dominates is unknown.

In the 2007-2008 and 2008-2009 school years, researchers conducted incentive experiments in public schools in Chicago (40 schools comprising 10,628 9th-grade students), Dallas (42 schools comprising 4,008 2nd-grade students), and New York City (121 schools comprising 16,449 4th- & 7th-grade students) – three prototypically low-performing urban school districts – distributing a total of $9.4 million (school and student figures include both treatment and control). All treatments were school-based randomized trials, which varied from city to city on several dimensions: what was rewarded, how often students were given incentives, the grade levels that participated, and the magnitude of the rewards. The key feature of each experiment was monetary payments to students – deposited directly into bank accounts opened for each student or paid to the student by check – for performance in school according to a simple incentive scheme. A coordinated implementation effort among twenty project managers ensured that students, parents, teachers, and key school staff understood the particulars of each program; that the program was implemented with high fidelity; and that payments were distributed on time and accurately.

The incentive schemes were designed to be both simple and politically feasible. In Dallas, researchers paid second graders $2 per book read, conditional on passing a short quiz confirming they had read it; rewards were distributed three times per year. In NYC, researchers paid fourth- and seventh-grade students for performance on a series of ten interim assessments already administered by the NYC Department of Education to all students; rewards were distributed five times per year. In Chicago, researchers paid ninth graders every five weeks for their grades in five core courses.
Intervention Start Date
2007-09-01
Intervention End Date
2009-06-16

Primary Outcomes

Primary Outcomes (end points)
Dallas (2nd-grade students): Iowa Tests of Basic Skills (ITBS)/Logramos (test for bilingual students) reading and math scores
New York City (4th- and 7th-grade students): New York state assessment ELA and math scores
Chicago (9th-grade students): PLAN English and math scores
Primary Outcomes (explanation)
state tests

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Experiments were conducted in 203 schools across three cities (roughly 27,000 students). All experiments followed a similar implementation plan. First, researchers garnered support from the district superintendent. Second, a letter was sent to principals of schools that served the desired grade levels. Third, researchers met with principals to discuss the details of the programs. After principals were given information about the experiment, there was a brief sign-up period – typically five to ten days. Schools that signed up to participate served as the basis for randomization, and all randomization was done at the school level. Students were required to have a signed parental consent form. Students received their first payments in the second week of October, and the last payments were distributed over the summer. All experiments lasted one academic year.

In Dallas, 42 schools signed up to participate in the experiment (3,718 second graders with test scores), and researchers randomly chose twenty-one of those schools to be treated. Participating schools received $1,500 to lower the cost of implementation. Upon finishing a book, each student took an Accelerated Reader (AR) computer-based comprehension quiz. The student earned a $2-per-book reward for scoring eighty percent or better on the book quiz, for up to 20 books per semester. Students were allowed to select and read books of their choice at the appropriate reading level and at their leisure, not as a classroom assignment. The books came from the existing stock available at their school. Quizzes were taken in the library on a computer, and students were allowed only one attempt per quiz. An important caveat of the Dallas experiment is that researchers combined Accelerated Reader (an existing commercial software program) with the use of incentives: if the Accelerated Reader program has an independent (positive) effect on student achievement, the impact of incentives would be overstated. Three times a year (twice in the fall and once in the spring), teachers in the program tallied the total incentive dollars earned by each student based on the number of passing quiz scores, and a check was then written to each student for that amount.
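For concreteness, the Dallas payout rule can be written as a small function. This is a minimal sketch under the stated rules ($2 per passing quiz, an eighty percent pass mark, a 20-book cap per semester); the function and variable names are hypothetical, not taken from the study's materials.

```python
# Hypothetical sketch of the Dallas per-semester payout rule described
# above: $2 per book whose quiz is passed (score >= 80%), capped at 20
# books per semester. Names are illustrative, not from the study.
def dallas_semester_payout(quiz_scores, rate=2.0, pass_mark=0.80, cap=20):
    passed = sum(1 for score in quiz_scores if score >= pass_mark)
    return rate * min(passed, cap)

# A student passing 23 quizzes in a semester still earns 20 * $2 = $40.
assert dallas_semester_payout([0.9] * 23) == 40.0
```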

In New York City, 121 schools signed up to participate; researchers randomly chose 63 schools (thirty-three fourth grades and thirty-one seventh grades) to be treated (15,883 fourth and seventh graders with test scores). A participating school received $2,500 if eighty percent of eligible students were signed up to participate and if the school had administered the first four assessments, and another $2,500 later in the year if eighty percent of students were signed up and if the school had administered all six assessments. Students were given incentives for their performance on six computerized exams (three in reading and three in math) and four predictive pencil-and-paper assessments. For each test, fourth graders earned $5 for completing the exam and $25 for a perfect score. The incentive scheme was strictly linear – each marginal increase in score was associated with a constant marginal benefit. The magnitudes were doubled for seventh graders – $10 for completing each exam and $50 for a perfect score – yielding the potential to earn $500 in a school year. Approximately sixty-six percent of students opened student savings accounts with Washington Mutual as part of the experiment, and money was deposited directly into these accounts. Certificates were distributed in school to make the earnings public. Students who did not participate because they did not return consent forms took identical exams but were not paid.
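The linear scheme can be illustrated with a short sketch. It assumes the payout interpolates linearly between the completion payment and the perfect-score payment ($5 to $25 for fourth graders, $10 to $50 for seventh graders); the score scale and all names are illustrative assumptions, not the study's actual rule table.

```python
# Hypothetical sketch of the NYC per-test reward under the "strictly
# linear" scheme: a completion floor plus a constant marginal benefit
# per point, reaching the maximum at a perfect score.
def nyc_test_payout(score, max_score, grade):
    floor, ceiling = (5.0, 25.0) if grade == 4 else (10.0, 50.0)
    return floor + (ceiling - floor) * (score / max_score)

# A seventh grader scoring 80% on a test would earn $10 + $40 * 0.8 = $42.
assert nyc_test_payout(80, 100, grade=7) == 42.0
```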

In Chicago, of the 70 schools that opted to participate, researchers selected the forty smallest (7,655 ninth graders) and randomly assigned twenty to treatment, with the other twenty serving as control. Participating schools received up to $1,500 to provide a bonus for the school liaison who served as the main contact for the implementation team. Students in Chicago were given grade incentives in five courses: English, mathematics, science, social science, and gym. Researchers rewarded each student with $50 for an A, $35 for a B, $20 for a C, and $0 for a D. If a student failed a core course, s/he received $0 for that course and temporarily "lost" all other monies earned from other courses in the grading period. Once the student made up the failing grade through credit recovery, night school, or summer school, all the "lost" money was reimbursed. Students could earn up to $250 every five weeks and up to $2,000 per year. Half of the rewards were given immediately after each five-week grading period ended; the other half was to be held in an account and paid in a lump sum conditional on high school graduation.
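A minimal sketch of the per-period payout logic follows, ignoring the half-now/half-at-graduation split described above. The grade-to-dollar map uses the stated amounts, and the "lost money" rule is modeled as earnings held pending credit recovery; all names are hypothetical.

```python
# Hypothetical sketch of the Chicago grade-incentive rule: $50/$35/$20/$0
# per course for A/B/C/D, with all earnings for the grading period
# withheld (not forfeited) if any of the five courses is failed.
REWARDS = {"A": 50.0, "B": 35.0, "C": 20.0, "D": 0.0, "F": 0.0}

def chicago_period_payout(grades):
    """grades: letter grades in the five incentivized courses.
    Returns (paid_now, held_pending_credit_recovery)."""
    earned = sum(REWARDS[g] for g in grades)
    if "F" in grades:          # a failing grade suspends all earnings
        return 0.0, earned
    return earned, 0.0

# Straight A's in five courses hit the $250-per-period maximum.
assert chicago_period_payout(["A"] * 5) == (250.0, 0.0)
```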

Researchers collected administrative and survey data. The data include information on each student's first and last name, birth date, address, race, gender, free lunch eligibility, attendance, matriculation records with course grades, special education status, and English Language Learner (ELL) status. In Dallas and New York, researchers are able to link students to their classroom teachers. New York City administrative files also contain teacher value-added data for teachers in grades four through eight, as well as data on student suspensions and behavioral incidents.

The main outcome variable is an achievement test unique to each city. All Chicago tenth graders take the PLAN assessment, an ACT college-readiness exam, in October. In May of every school year, students in regular classes in Dallas elementary schools take the Iowa Tests of Basic Skills (ITBS) if they are in kindergarten, first grade, or second grade. Students in bilingual classes in Dallas take a different exam, called Logramos. In New York City, McGraw-Hill mathematics and English Language Arts tests are administered each winter to students in grades three through eight.

Researchers use a parsimonious set of controls to improve precision and to correct for any potential imbalance between treatment and control. The most important controls are reading and math achievement test scores from the previous two years, which are included in all regressions along with their squares. Previous years' test scores are available for most students who were in the district in those years. Researchers also include an indicator variable equal to one if a student is missing a test score from a previous year and zero otherwise. Other individual-level controls include a mutually exclusive and collectively exhaustive set of race dummies pulled from each school district's administrative files, and indicators for free lunch eligibility, special education status, and English Language Learner status. Researchers also construct three school-level control variables: the percent of the student body that is black, percent Hispanic, and percent free lunch eligible. A sketch of this control set as a regression formula appears below.
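For concreteness, the control set can be expressed in patsy formula syntax. This is a sketch with assumed column names, not the study's actual specification file.

```python
# Sketch of the parsimonious control set as a patsy formula fragment.
# All column names are assumptions for illustration only.
CONTROLS = (
    "read_tm1 + math_tm1 + read_tm2 + math_tm2"                  # prior scores
    " + I(read_tm1**2) + I(math_tm1**2)"
    " + I(read_tm2**2) + I(math_tm2**2)"                         # their squares
    " + miss_tm1 + miss_tm2"                                     # missing-score indicators
    " + C(race) + free_lunch + sped + ell"                       # student-level controls
    " + sch_pct_black + sch_pct_hispanic + sch_pct_free_lunch"   # school-level shares
)
```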

To supplement each district’s administrative data, researchers administered a survey in each of the three school districts. These surveys include basic demographics of each student such as family structure and parental education, time use, effort and behavior in school, and the Intrinsic Motivation Inventory.

Researchers assess the effects of incentives on state test scores – an indirect and non-incentivized outcome. They estimate intent-to-treat (ITT) effects using a regression that includes an indicator for assignment to treatment along with the parsimonious set of individual- and school-level controls described above, all measured pre-treatment. ITT provides an estimate of the impact of being offered a chance to participate in a financial incentive program. All student mobility between schools after random assignment is ignored: researchers include only students who were in treatment and control schools as of October 1 of the treatment year. For most districts, school begins in early September; the first student payments were distributed in mid-October. All standard errors are clustered at the school level. To ensure that they do not overfit the model, researchers also estimate treatment effects with school-level regressions. A minimal sketch of this estimation appears below.
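The following is a minimal sketch of the ITT estimation in Python with statsmodels, assuming a student-level DataFrame with illustrative column names (an abbreviated version of the control formula above); it is not the authors' code.

```python
# Hypothetical ITT sketch: regress the outcome test score on a treatment
# indicator plus pre-treatment controls, clustering standard errors at
# the school, the unit of randomization. Column names are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

def estimate_itt(df: pd.DataFrame):
    formula = (
        "test_score ~ treat"
        " + read_tm1 + math_tm1 + I(read_tm1**2) + I(math_tm1**2)"
        " + miss_prior + C(race) + free_lunch + sped + ell"
        " + sch_pct_black + sch_pct_hispanic + sch_pct_free_lunch"
    )
    return smf.ols(formula, data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["school_id"]}
    )

# The coefficient on `treat` is the ITT effect of being offered the program:
#   result = estimate_itt(df); print(result.params["treat"])
```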

Researchers also assess the effect of incentives on the outcomes for which students were given direct incentives (i.e., books read in Dallas, predictive assessments in NYC, and report card grades in Chicago), on self-reported effort, and on intrinsic motivation. They also assess treatment effects for subsamples defined by gender, race/ethnicity, previous year's test score, an income proxy, English Language Learner status, and, in Dallas only, whether a student took the English or the Spanish test. All categories are mutually exclusive and collectively exhaustive. Standard errors are clustered at the school level.
Experimental Design Details
Randomization Method
Suppose there are X schools that are interested in participating and we aim to have a treatment group of size Y. Then there are X-choose-Y potential treatment-control designations. From this set of possibilities – 2.113 × 10^41 in New York – we randomly selected 10,000 treatment-control designations and, for each, estimated the equation

(1) treatment_s = α + X_s β + ε_s,

where the dependent variable takes the value one for all treatment schools and X_s represents data measured at the school level that were available at the time of randomization. We then selected the randomization that minimized the maximum z-score from equation (1). A sketch of this procedure appears below.
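As a minimal illustration of the procedure above, the following sketch draws candidate assignments, regresses the treatment indicator on school-level covariates, and keeps the draw with the smallest maximum absolute statistic. The covariate matrix and all names are placeholders, and OLS t-statistics stand in for the z-scores; this is not the study's actual randomization code.

```python
# Hypothetical sketch of the rerandomization selection: among many
# candidate treatment-control designations, keep the one whose
# school-level covariates are most balanced (smallest max |z|).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def best_assignment(X, n_treat, n_draws=10_000):
    """X: (n_schools, k) covariates available at randomization time."""
    n = X.shape[0]
    best, best_max_z = None, np.inf
    for _ in range(n_draws):
        treat = np.zeros(n)
        treat[rng.choice(n, size=n_treat, replace=False)] = 1.0
        # Equation (1): regress the treatment indicator on covariates.
        fit = sm.OLS(treat, sm.add_constant(X)).fit()
        max_z = np.abs(fit.tvalues[1:]).max()  # skip the intercept
        if max_z < best_max_z:
            best, best_max_z = treat, max_z
    return best, best_max_z

# Example with placeholder data: 121 schools, 5 covariates, 63 treated.
X = rng.normal(size=(121, 5))
assignment, max_z = best_assignment(X, n_treat=63)
```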
Randomization Unit
School
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
Dallas: 42 schools (2nd grades)
NYC: 121 schools (4th & 7th grades)
Chicago: 40 schools (9th grades)
Sample size: planned number of observations
Dallas: 4,008 students (2nd grade)
NYC: 16,449 students (4th & 7th grades)
Chicago: 10,628 students (9th grade)
Sample size (or number of clusters) by treatment arms
Dallas: 21 treatment/21 control schools
NYC: 63 treatment (33 4th grades and 31 7th grades)/58 control schools
Chicago: 20 treatment/20 control schools
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
Harvard's Committee on the Use of Human Subjects/NBER IRB
IRB Approval Date
2008-09-18
IRB Approval Number
F16545, F14653, IRB 08-043

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
Yes
Intervention Completion Date
June 16, 2009, 12:00 +00:00
Data Collection Complete
Yes
Data Collection Completion Date
October 07, 2009, 12:00 +00:00
Final Sample Size: Number of Clusters (Unit of Randomization)
Dallas: 42 schools (2nd grades)
NYC: 121 schools (4th & 7th grades)
Chicago: 40 schools (9th grades)
Was attrition correlated with treatment status?
No
Final Sample Size: Total Number of Observations
Students with test scores:
Dallas: 3,718 2nd-grade students
NYC: 6,582 4th-grade students & 9,301 7th-grade students (15,883 total)
Chicago: 7,655 9th-grade students
Final Sample Size (or Number of Clusters) by Treatment Arms
Dallas: 21 treatment/21 control schools; 1,777 treatment/1,941 control 2nd-grade students
NYC: 63 treatment (33 4th grades and 31 7th grades)/58 control schools; 3,348 treatment/3,234 control 4th-grade students & 4,605 treatment/4,696 control 7th-grade students
Chicago: 20 treatment/20 control schools; 3,275 treatment/4,380 control 9th-grade students
Data Publication

Data Publication

Is public data available?
No

There is information in this trial unavailable to the public.

Program Files

Program Files
No
Reports, Papers & Other Materials

Relevant Paper(s)

Abstract
Online Appendix to Financial Incentives and Student Achievement: Evidence from Randomized Trials
Citation
Fryer, Roland. 2011. "Online Appendix to Financial Incentives and Student Achievement: Evidence from Randomized Trials." Quarterly Journal of Economics. 126(4):1755-1798.
Abstract
This paper describes a series of school-based field experiments in over 200 urban schools across three cities designed to better understand the impact of financial incentives on student achievement. In Dallas, students were paid to read books. In New York, students were rewarded for performance on interim assessments. In Chicago, students were paid for classroom grades. Researchers estimate that the impact of financial incentives on state test scores is statistically zero in each city. Due to a lack of power, however, researchers cannot rule out effect sizes that would have positive returns on investment. The only statistically significant effect is on English-speaking students in Dallas. The paper concludes with a speculative discussion of what might account for inter-city differences in estimated treatment effects.
Citation
Fryer, Roland. 2011. "Financial Incentives and Student Achievement: Evidence from Randomized Trials." Quarterly Journal of Economics. 126(4):1755-1798.

Reports & Other Materials