Promoting Early Grade Reading & Numeracy in Tanzania: KiuFunza II

Last registered on May 04, 2016

Pre-Trial

Trial Information

General Information

Title
Promoting Early Grade Reading & Numeracy in Tanzania: KiuFunza II
RCT ID
AEARCTR-0001009
Initial registration date
May 04, 2016


First published
May 04, 2016, 12:35 PM EDT


Locations

Primary Investigator

Affiliation
University of California, San Diego

Other Primary Investigator(s)

PI Affiliation
University of California, San Diego
PI Affiliation
University of Virginia
PI Affiliation
University of California, San Diego

Additional Trial Information

Status
Ongoing
Start date
2015-01-01
End date
2016-12-31
Secondary IDs
Abstract
Overall student learning levels remain low across East Africa, despite more than a decade of major reforms and significant new investments in public education. In recent years, teacher performance pay has received increasing attention as a means of improving student learning. Yet the current evidence on teacher performance pay is at best mixed, with some studies finding large positive effects and others finding little or no effect. Moreover, these studies are not directly comparable, as they were performed in different contexts, with different incentive structures and different budgets.

In the KiuFunza II RCT we evaluate two different teacher incentive programs implemented in 180 randomly selected government primary schools across ten districts in Tanzania, focusing on English, Kiswahili, and Math in Grades 1, 2, and 3. In the first arm, “levels”, teachers are paid a bonus based on the number of skills within a given subject a student is able to master. In the second arm, “gains”, students are placed in ability groups at the beginning of the year based on starting test scores. Teachers are then rewarded for their students’ improvements within their specific ability groups, regardless of initial learning levels.

The goal of the RCT is to determine whether there is clear evidence that teacher incentive schemes and pay for performance programs are effective at improving learning outcomes.
External Link(s)

Registration Citation

Citation
Mbiti, Isaac et al. 2016. "Promoting Early Grade Reading & Numeracy in Tanzania: KiuFunza II." AEA RCT Registry. May 04. https://doi.org/10.1257/rct.1009-1.0
Former Citation
Mbiti, Isaac et al. 2016. "Promoting Early Grade Reading & Numeracy in Tanzania: KiuFunza II." AEA RCT Registry. May 04. https://www.socialscienceregistry.org/trials/1009/history/8095
Sponsors & Partners

There is information in this trial unavailable to the public.
Experimental Details

Interventions

Intervention(s)
Levels: In the levels-based cash-on-delivery (COD) program, teachers are paid a bonus based on the number of skills their students are able to master within subject tests. That is, even if a student is unable to pass the entire Math test, if they are able to demonstrate mastery of addition and subtraction (but still fail multiplication), the teacher will earn a small bonus.

Whereas under simpler COD programs a teacher receives payment only if their student masters every skill within a subject test and so passes the test, the levels incentive payment is disaggregated to reward teachers whose students will probably never pass the full tests but can still improve their learning. We are, in short, incentivizing teachers to teach low-ability children and not just those who are at the margin of passing the complete tests.
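
The contrast between a pass/fail payment and the disaggregated levels payment can be illustrated with a minimal sketch. The bonus amounts, the skill names, and the flat per-skill rate below are hypothetical placeholders chosen for illustration; the registration does not state the actual payment schedule.

```python
# Hypothetical amounts, for illustration only (not the program's real rates).
PASS_BONUS = 10.0   # paid only when every skill in the subject test is mastered
SKILL_BONUS = 2.0   # paid for each individual skill mastered

MATH_SKILLS = ("addition", "subtraction", "multiplication")


def pass_fail_bonus(mastered, skills=MATH_SKILLS):
    """Simpler COD rule: pay only if the student masters all skills."""
    return PASS_BONUS if all(skill in mastered for skill in skills) else 0.0


def levels_bonus(mastered, skills=MATH_SKILLS):
    """Levels rule (sketch): pay a small amount per skill mastered."""
    return SKILL_BONUS * sum(1 for skill in skills if skill in mastered)


# A student who masters addition and subtraction but fails multiplication.
student = {"addition", "subtraction"}
print(pass_fail_bonus(student), levels_bonus(student))
```

Under the pass/fail rule the teacher of this student earns nothing, while under the per-skill rule the partial mastery is still rewarded.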

6 schools in each of the 10 districts have been randomly assigned to the levels treatment group.

Gains: Although it is also designed to incentivize teacher performance through bonus payments, the gains intervention differs from the levels program described above. Teachers in schools receiving this intervention compete in a national tournament for a share of the total “prize” money. Teachers whose students have learned the most during the year will get a larger bonus. To make this competition as fair as possible, and so incentivize teachers to focus on all students regardless of their ability at the start of the competition, each student is placed in an ability group based on their scores on the Twaweza 2014 endline tests. Students not tested by Twaweza are grouped separately, and Grade 1 students are grouped using characteristics of their schools.

After endline testing in 2015, test scores will be used to rank each student within their ability group. Teachers whose students rank highest at endline will be paid the greatest amount, while teachers whose students rank at the bottom will receive nothing. Teachers who rank in between will earn an amount that depends on the ranking of their students within each ability group.
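
The ranking step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's actual payment rule: it assumes each student earns their teacher points equal to the share of other students in the same ability group who scored strictly lower, and that the prize pool is split in proportion to those points. The record layout, names, and scores are hypothetical.

```python
from collections import defaultdict


def teacher_bonuses(students, prize_pool):
    """Split prize_pool across teachers by their students' within-group ranks.

    students: list of dicts with keys "teacher", "group", "endline".
    Assumption: a student's points are the fraction of other students in the
    same ability group with a strictly lower endline score (bottom = 0, top = 1).
    """
    by_group = defaultdict(list)
    for record in students:
        by_group[record["group"]].append(record)

    points = defaultdict(float)
    for members in by_group.values():
        others = len(members) - 1
        for record in members:
            points[record["teacher"]] += 0.0  # make sure every teacher appears
            if others > 0:
                lower = sum(1 for m in members if m["endline"] < record["endline"])
                points[record["teacher"]] += lower / others

    total = sum(points.values())
    if total == 0:
        return {teacher: 0.0 for teacher in points}
    return {teacher: prize_pool * p / total for teacher, p in points.items()}


# Hypothetical data: two ability groups, three teachers.
students = [
    {"teacher": "T1", "group": "low", "endline": 10},
    {"teacher": "T2", "group": "low", "endline": 20},
    {"teacher": "T3", "group": "low", "endline": 30},
    {"teacher": "T1", "group": "high", "endline": 90},
    {"teacher": "T2", "group": "high", "endline": 70},
    {"teacher": "T3", "group": "high", "endline": 80},
]
bonuses = teacher_bonuses(students, prize_pool=300.0)
print(bonuses)
```

Because students are compared only with peers in the same ability group, a teacher can earn points from a low-scoring student who outperforms similar classmates.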

According to education experts, this is the most high-powered design for incentivizing teachers to exert greater effort, leading to improvements in learning. Because we want to reward teachers for improvements, we need two test results for each student: one at the end of last year and one at the end of this year. The difference in learning that we measure will determine how much the teacher earns.

6 schools in each of the 10 districts have been randomly assigned to the gains treatment group.

Control: Schools in the control group do not receive any type of teacher incentive program. 6 schools in each of the 10 districts have been randomly assigned to control.
Intervention Start Date
2015-01-01
Intervention End Date
2016-12-31

Primary Outcomes

Primary Outcomes (end points)
The main outcome variables are students' test scores, as a proxy for student learning. Assuming we are able to establish that the treatments had an impact on students' test scores, secondary outcome variables will be used to establish spillover effects, the mechanisms behind the main treatment effects, and how the treatment effects vary across student, household, teacher, and school characteristics.

A sample of 40 students per school (10 students from each of Grades 1, 2, 3, and 4) is to be tested in all treatment schools and in 100 control schools. Students will be tested at the end of the first year and at the end of the second year. Grade 1 students will be tested at the beginning of the first and second years, to provide baseline scores to evaluate their initial learning levels. The information from these tests will be used to calculate standardized test scores and compare achievement for children across treatment groups.

We also collect detailed student information (e.g. age and gender), detailed school information (e.g. facilities, management practices, and head teacher characteristics), detailed teacher information (e.g. education, age, experience, and self-reported time use), and detailed household information (e.g. parental engagement in the child's education, parents' own education, household composition, and assets owned by the household). The information from our household, teacher, and school surveys can be used to identify the mechanisms through which the treatment affects test scores. For example, we can look at the changes in learning outcomes in non-incentivized subjects; how teachers spend their time in school; how schools allocate funding (e.g. textbooks, scholarships, meals); whether schools increase the hours taught in the incentivized subjects; and whether households become more or less engaged in the child's education after the intervention. Using the baseline survey data, we can study how the treatment effects differ across student, household, teacher, and school characteristics.
Primary Outcomes (explanation)
The main outcome variables will be standardized test scores. First, we will construct a standardized test score for each subject in each grade by subtracting the mean and dividing by the standard deviation of the test scores in the control group. Once we have subject-grade standardized test scores, we will add these up across grades and then re-normalize (dividing by the standard deviation of the test scores in the control group); this will yield a standardized test score for each subject.

For some analyses we will also aggregate test scores across subjects by summing them and then re-normalizing (dividing by the standard deviation of the test scores in the control group).
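
The standardization described above can be sketched as follows, for a single grade and two subjects. The function names, the six example students, and their raw scores are hypothetical, and the sample standard deviation is an assumption (the text does not say whether the sample or population formula is used).

```python
from statistics import mean, stdev


def z_scores(raw, is_control):
    """Standardize raw scores using the control group's mean and SD."""
    control = [x for x, c in zip(raw, is_control) if c]
    mu, sd = mean(control), stdev(control)
    return [(x - mu) / sd for x in raw]


def composite(z_by_subject, is_control):
    """Sum each student's subject z-scores, then re-normalize by dividing
    by the control-group SD of those sums."""
    sums = [sum(values) for values in zip(*z_by_subject.values())]
    sd = stdev([s for s, c in zip(sums, is_control) if c])
    return [s / sd for s in sums]


# Hypothetical raw scores; the first three students are in control schools.
is_control = [True, True, True, False, False, False]
z_math = z_scores([40, 50, 60, 70, 55, 45], is_control)
z_kiswahili = z_scores([30, 50, 40, 60, 50, 20], is_control)
overall = composite({"math": z_math, "kiswahili": z_kiswahili}, is_control)
print(z_math, overall)
```

With this construction, effects are expressed in units of the control group's standard deviation, so estimates are comparable across subjects and grades.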

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
This evaluation was conducted in 180 government primary schools in 10 randomly selected districts across Tanzania. In each district, 18 schools, of the initial 35 from the first phase of the project, were randomly allocated to one of the three program groups: levels (6 per district), gains (6 per district), or control (6 per district). At the request of Twaweza, the implementing partner, the probability of assignment to each treatment was a function of the school's previous treatment status during KiuFunza I. (See the pre-analysis plan for further details on the allocation process.)
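
A district-stratified assignment of this shape can be sketched as follows. This is a simplified illustration, not the procedure the study used: the actual randomization was done in Stata, and the assignment probabilities depended on each school's KiuFunza I treatment status, which this sketch ignores. The school identifiers and the seed are hypothetical.

```python
import random
from collections import Counter

ARMS = ("levels", "gains", "control")


def assign_arms(schools_by_district, per_arm=6, seed=1009):
    """Shuffle each district's schools and deal them into equal-sized arms."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    assignment = {}
    for district, schools in sorted(schools_by_district.items()):
        if len(schools) != per_arm * len(ARMS):
            raise ValueError(f"{district}: expected {per_arm * len(ARMS)} schools")
        shuffled = list(schools)
        rng.shuffle(shuffled)
        for position, school in enumerate(shuffled):
            assignment[school] = ARMS[position // per_arm]
    return assignment


# Hypothetical identifiers: 10 districts with 18 schools each.
districts = {
    f"D{d:02d}": [f"D{d:02d}-S{s:02d}" for s in range(18)] for d in range(10)
}
assignment = assign_arms(districts)
print(Counter(assignment.values()))
```

Stratifying by district guarantees that every district contributes the same number of schools to each arm, which is what allows district fixed effects to absorb cross-district differences.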

The aims of the two treatment groups are as follows:
1. The levels teacher incentive intervention seeks to:
a. test the extent to which paying for performance improves basic literacy and numeracy; and
b. understand the effects of the intervention on public debate and citizen engagement.

2. The gains teacher incentive intervention seeks to:
a. test the extent to which paying for performance improves basic literacy and numeracy;
b. test how efficiently designed incentive programs enhance teacher motivation; and
c. understand the effects of the intervention on public debate and citizen engagement.
Experimental Design Details
Randomization Method
Randomization was done in the office using Stata.
Randomization Unit
Random sampling: District level. Random treatment assignment: School level
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
180 schools
Sample size: planned number of observations
180 schools; 7,200 students (40 per school); 1,800 teachers (8-12 per school); 2,700 households (15 per school)
Sample size (or number of clusters) by treatment arms
60 schools Treatment 1 (Levels)
60 schools Treatment 2 (Gains)
60 schools Control (no treatment)
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
We use Optimal Design Software for power calculations, as well as our own calculation, which allows us to include additional details on the design. We originally assumed an intra-cluster correlation of 0.1 (that is, the intra-cluster correlation for value added) and that 30% of the variation could be explained by baseline test scores and other covariates (such as age, gender, and school and teacher characteristics) and district fixed effects. Our main outcomes (the effect of each treatment arm) have a total of 120 clusters (60 control schools and 60 treatment schools). Additionally, the difference between the treatment effects in the two treatment arms also has 120 clusters (60 "gains" schools and 60 "levels" schools).

With 120 clusters and a significance level of 5%, we have a minimum detectable effect size of 0.17 with power of 80%, 0.2 with power of 90%, and 0.22 with power of 95%. If we assume a higher intra-cluster correlation (0.3), with 120 clusters and a significance level of 5%, we have a minimum detectable effect size of 0.28 with power of 80%, 0.33 with power of 90%, and 0.37 with power of 95%.

However, based on data from these same schools in previous years, we know that the intra-cluster correlation is 0.15 for Kiswahili, 0.06 for English, and 0.14 for Math. The proportion of the variation that can be explained by baseline test scores and other covariates (such as age, gender, and school and teacher characteristics) and district fixed effects is 40% for Kiswahili, 36% for English, and 37% for Math. Using the most conservative estimates (0.15 intra-cluster correlation and 36% of the variance explained by baseline characteristics), with 120 clusters and a significance level of 5%, we have a minimum detectable effect size of 0.2 with power of 80%, 0.24 with power of 90%, and 0.26 with power of 95%. See the attached document for more detailed calculations.
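
The minimum detectable effect sizes quoted above can be approximated with a standard closed-form expression for cluster-randomized designs. The sketch below is not the authors' calculation (that is in the attached document and Optimal Design Software). It assumes a normal approximation, an equal split of clusters between the two groups being compared, 40 tested students per school, and that covariates explain only within-school variance. Under those assumptions the first scenario (intra-cluster correlation 0.1, 30% explained, 80% power) comes out at roughly 0.17; other scenarios land close to, but not always exactly on, the quoted figures.

```python
from math import sqrt
from statistics import NormalDist


def mde(n_clusters, cluster_size, icc, r2_within, power,
        alpha=0.05, share_treated=0.5):
    """Approximate minimum detectable effect in control-group SD units.

    Assumptions of this sketch: two-sided test, normal critical values,
    and covariates (r2_within) reducing only the within-cluster variance.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Between-cluster variance plus residual within-cluster variance
    # of a cluster mean, with total outcome variance normalized to 1.
    variance = icc + (1 - icc) * (1 - r2_within) / cluster_size
    design = share_treated * (1 - share_treated) * n_clusters
    return multiplier * sqrt(variance / design)


for power in (0.80, 0.90, 0.95):
    print(power, round(mde(120, 40, icc=0.10, r2_within=0.30, power=power), 2))
```

The expression makes the main trade-off visible: with 40 students per school the within-school term is small, so the intra-cluster correlation and the number of schools dominate the detectable effect size.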
Supporting Documents and Materials

There is information in this trial unavailable to the public.
IRB

Institutional Review Boards (IRBs)

IRB Name
Institutional Review Board for the Social & Behavioral Sciences, University of Virginia
IRB Approval Date
2015-03-16
IRB Approval Number
2014031800
IRB Name
Innovations for Poverty Action IRB
IRB Approval Date
2015-10-09
IRB Approval Number
1099
Analysis Plan

Analysis Plan Documents

KiuFunza II: Pre-Analysis Plan

MD5: 0b1ee7a95c18ed9356e803d266079d5d

SHA1: 072f0a476dafadd1506e9aa04f43c8b9b9d5370a

Uploaded At: May 04, 2016

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials