Improving Accountability in Khyber Pakhtunkhwa's Schools
Last registered on March 30, 2018

Pre-Trial

Trial Information
General Information
Title
Improving Accountability in Khyber Pakhtunkhwa's Schools
RCT ID
AEARCTR-0002815
Initial registration date
March 30, 2018
Last updated
March 30, 2018 5:35 PM EDT
Location(s)
Primary Investigator
Affiliation
University of Oxford
Other Primary Investigator(s)
PI Affiliation
International Food Policy Research Institute (IFPRI)
PI Affiliation
International Food Policy Research Institute (IFPRI)
PI Affiliation
Lahore University of Management Sciences
PI Affiliation
Consortium for Development Policy Research
Additional Trial Information
Status
Ongoing
Start date
2017-09-18
End date
2018-09-30
Secondary IDs
Abstract
Pakistan has low student learning levels and educator motivation, and accountability is a pervasive problem. The Khyber Pakhtunkhwa (KP) Elementary and Secondary Education (E&SE) Department wants to improve learning by increasing teacher accountability through the reform of two dysfunctional accountability systems. The first is the system of ‘annual confidential reports’ for teachers. These evaluations were conducted at the end of the calendar year (overlapping two school years), performed by head teachers who find it difficult to criticize colleagues, and did not include teaching-specific measures known to improve learning. Moreover, although in theory evaluation scores should be tied to promotion, in practice promotions have been based on seniority, educational qualifications, and/or political connections. The second source of dysfunction is the system of school inspections. Head teachers face school inspections, but they are irregular, unstructured, and results are often unreported. Thus, neither system currently motivates effort.

During the 2017/8 school year, the research team, alongside the KP E&SE Department, is introducing two new accountability systems with the aim of motivating educator effort and improving student learning outcomes. The first is the Annual Teacher Evaluation (ATE), which aims to address previous shortcomings in a number of ways. It covers the school rather than calendar year, and so a single cohort of students. It focuses on teaching-specific outcomes—presence of the teacher and his/her students, the teacher’s pedagogy, and test scores of the teacher’s students. It is conducted by relatively more independent, district-level inspectors rather than colleagues from the same school. Finally, and crucially, teachers’ ATE scores are explicitly linked to career progression via promotion tournaments within districts; teachers who perform well relative to peers in similar schools are fast-tracked for promotion, while those who perform poorly are held back.

The second new accountability system is the School Inspection Report (SIR). This school-level report is also undertaken by independent district inspectors and, in contrast to the previous system, is based on regular, structured site visits. Similar information is collected as for ATEs: presence of the head teacher, staff, and students within the school, and teacher pedagogy. The main difference compared to the ATE is that it is the head teacher’s career progression that is on the line; head teachers of schools that perform favourably compared to other schools in their circle (the equivalent of a school district) are fast-tracked for promotion, while head teachers of schools that perform relatively poorly are held back.

The study team is collaborating with the KP E&SE Department to undertake a randomized controlled trial of these two new accountability systems. The trial is funded by the International Growth Centre (IGC), the Lahore University of Management Sciences (LUMS), the United States Agency for International Development (USAID), and the University of Oxford, and involves 240 rural, public primary schools in three districts in KP: Charsadda, Mardan, and Nowshera. There are three study arms, each with 80 schools: new ATE (teacher incentives), new SIR (head teacher incentives) and a ‘business-as-usual’ control. Both the ATE and SIR will be tied to promotion tournaments within circles (the equivalent of school districts). Twelve schools in each circle will participate in the study, 4 in each study arm. The study will provide the KP E&SE Department with an evidence base to assess whether strengthening of career incentives is an effective policy to motivate teacher effort and improve student learning outcomes.
External Link(s)
Registration Citation
Citation
Asad, Saher et al. 2018. "Improving Accountability in Khyber Pakhtunkhwa's Schools." AEA RCT Registry. March 30. https://doi.org/10.1257/rct.2815-1.0
Former Citation
Asad, Saher et al. 2018. "Improving Accountability in Khyber Pakhtunkhwa's Schools." AEA RCT Registry. March 30. https://www.socialscienceregistry.org/trials/2815/history/27504
Sponsors & Partners

Experimental Details
Interventions
Intervention(s)
The study sample of 240 schools consists of 20 circles (5 girls' circles and 15 boys' circles) with 12 schools in each circle, across three districts. Treatment was randomly assigned within circles. In each circle, there are 4 schools in the ATE treatment, 4 in the SIR treatment, and 4 in the 'business-as-usual' control.
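
The within-circle assignment described above can be sketched as follows. This is only an illustration under stated assumptions: the school identifiers, the seed, and the function names are hypothetical, and the registration does not say which software or procedure the PI team actually used.

```python
import random

ARMS = ["ATE", "SIR", "control"]

def assign_within_circle(school_ids, rng):
    """Randomly split the 12 schools of one circle into 3 arms of 4."""
    if len(school_ids) != 12:
        raise ValueError("expected 12 schools per circle")
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    # First 4 shuffled schools -> ATE, next 4 -> SIR, last 4 -> control.
    return {school: ARMS[i // 4] for i, school in enumerate(shuffled)}

def assign_all(circles, seed=2017):
    """circles: dict mapping circle id -> list of 12 school ids."""
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    assignment = {}
    for circle_id in sorted(circles):
        assignment.update(assign_within_circle(circles[circle_id], rng))
    return assignment

# Hypothetical identifiers: 20 circles x 12 schools = 240 schools.
circles = {c: [f"c{c:02d}_s{s:02d}" for s in range(12)] for c in range(20)}
assignment = assign_all(circles)
```

Because the shuffle is done separately inside each circle, every circle contributes exactly 4 schools to each arm, which gives the 80/80/80 split across the 20 circles.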

ATE Intervention (80 schools)

All schools will receive three 'surprise' visits from an independent district inspector during the 2017/8 school year. The matching of inspectors to schools, and the dates of the site visits will be determined randomly by researchers. Inspectors will be provided with information on the school the evening before the visit, and will be instructed not to notify the school. If the inspector cannot visit that school that day, he/she will need to provide a reason and, if this is acceptable, he/she will be reassigned an alternative, randomly selected school the following day. All inspectors will be provided with: the ATE protocol; a tablet (pre-loaded with a relevant computer assisted personal interview (CAPI) program with which the ATE will be conducted, including teacher/student rosters) and tripod stand; a paper teacher pedagogy assessment tool; and a Wi-Fi box with credit to upload data. Upon arrival at the start of the school day, the inspector will explain the new ATE system to the head teacher and find out when the Grade 4 math class will be taught. Each inspector will then collect information on four performance measures: the presence and pedagogy of the Grade 4 math teacher, together with the presence of and learning outcomes for his/her students in the Grade 4 math class.

Scoring rubric. These four performance measures will be scored to produce an overall ATE score. For the first three, scoring is based on absolute performance against a pre-specified objective rubric.

• Teacher attendance: present or excused absent = 8 points, unexcused absent = 0 points.
• Student attendance: sliding scale from 0 to 8 points for share of students present (max for >90% present).
• Teacher pedagogy: sliding scale from 0 to 8 points for average share of students engaged in active teaching activities (max for >88% engaged).

For student learning, scoring is based on relative performance and is intended to capture the main features of the 'pay-for-percentile' approach (Barlevy and Neal 2012). Specifically, once all data have been collected from all 80 ATE study schools, Grade 4 math teachers will be put into 'bins' of 8 based on their start of year (first visit) 'percentage of 50 Grade 4 math questions answered correctly' score. At the end of the year (third visit), these teachers will be ranked within bins. The Grade 4 math teacher in the school with the top rank will receive 8 marks, the next rank 7 marks, down to zero for the Grade 4 math teacher in the school with the bottom rank. This scoring of student learning will be done by the research team.
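
A rough sketch of the relative scoring step follows. It makes several assumptions the text does not confirm: teachers are ranked within a bin by their end-of-year score (rather than by gains), ties are ignored, and marks fall by one per rank. The function name and tuple layout are hypothetical. Note that with 8 teachers per bin and one-mark steps from 8, the bottom-ranked teacher receives 1 mark in this sketch, whereas the text says zero; the exact mapping is left as described in the source.

```python
def percentile_marks(teachers, bin_size=8, top_marks=8):
    """teachers: list of (teacher_id, baseline_pct, endline_pct) tuples.

    Returns {teacher_id: marks} for the student-learning component.
    """
    # Group teachers with similar starting points: sort by the baseline
    # (first visit) score and cut the sorted list into consecutive bins.
    by_baseline = sorted(teachers, key=lambda t: t[1])
    marks = {}
    for start in range(0, len(by_baseline), bin_size):
        one_bin = by_baseline[start:start + bin_size]
        # ASSUMPTION: rank within the bin by endline score, highest first.
        ranked = sorted(one_bin, key=lambda t: t[2], reverse=True)
        for position, (teacher_id, _, _) in enumerate(ranked):
            marks[teacher_id] = max(top_marks - position, 0)
    return marks

# Hypothetical data: 16 teachers, so two bins of 8.
teachers = [(f"t{i}", float(i), float((i * 5) % 16)) for i in range(16)]
marks = percentile_marks(teachers)
```

Binning on the baseline score means each teacher competes only against teachers whose students started the year at a similar level.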

The detailed protocol for all four dimensions of this scoring rubric will be communicated to inspectors during an October 2017 training (following a September 2017 baseline survey, but before treatment begins), and to the Grade 4 math teacher at the start of the first surprise visit, via a letter outlining the intervention and the scoring.

Career incentives. The scores across these four performance measures will be aggregated into a single, cardinal score for each Grade 4 math teacher. Specifically, the overall score will be the sum of 12 measures (teacher attendance x3, student attendance x3, teacher pedagogy x3, student performance x3) scored on the 0-8-point scale. (While there will be a single measure of student performance, it will be included three times in the score calculation to ensure that the four categories are weighted equally, as teachers were led to expect in the information sheet.) These cardinal scores will feed into an ordinal ranking exercise. Within each of the 20 circles in the study, there are 4 schools randomly assigned to the ATE treatment. At the end of the school year, the four Grade 4 math teachers within each circle x ATE treatment pair will be ranked on the basis of their cardinal ATE score. In the event of a tie, the inspector (who will have visited all schools in the comparison set) will be asked to consider all four dimensions of performance and break the tie in favour of one teacher or the other. The top-ranked teacher will have his/her promotion fast-tracked by one year, while the bottom-ranked teacher will have his/her promotion delayed by one year. The two teachers in the middle of the ranking will experience no change.
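
The aggregation and ranking can be illustrated with a short sketch. The components listed in the parentheses above (three visits each for teacher attendance, student attendance, and pedagogy, plus student performance counted three times) give twelve terms of up to 8 points, so the maximum is 96. The function names and outcome labels are hypothetical, and the sketch deliberately stops when a tie would affect the top or bottom rank, since the text leaves tie-breaking to the inspector.

```python
def overall_ate_score(teacher_att, student_att, pedagogy, student_perf):
    """Sum the twelve 0-8 components: three per-visit lists plus the single
    student-performance mark counted three times (maximum 96)."""
    for visits in (teacher_att, student_att, pedagogy):
        if len(visits) != 3:
            raise ValueError("expected one score per visit (3 visits)")
    return sum(teacher_att) + sum(student_att) + sum(pedagogy) + 3 * student_perf

def promotion_outcomes(scores):
    """scores: {teacher_id: overall score} for the 4 ATE teachers in one circle."""
    if len(scores) != 4:
        raise ValueError("expected 4 teachers per circle x ATE pair")
    ranked = sorted(scores, key=scores.get, reverse=True)
    top_tied = scores[ranked[0]] == scores[ranked[1]]
    bottom_tied = scores[ranked[-1]] == scores[ranked[-2]]
    if top_tied or bottom_tied:
        # Tie-breaking is a judgment made by the inspector, not by code.
        raise ValueError("tie at top or bottom: refer to the inspector")
    outcome = {teacher: "no change" for teacher in ranked}
    outcome[ranked[0]] = "fast-tracked by one year"
    outcome[ranked[-1]] = "delayed by one year"
    return outcome

# Hypothetical scores for one circle.
example_score = overall_ate_score([8, 8, 0], [6, 7, 8], [4, 5, 6], 5)
example_outcomes = promotion_outcomes({"a": example_score, "b": 50, "c": 80, "d": 61})
```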

In an attempt to prevent dysfunctional behaviour (collusion and/or demotivation) among teachers, the precise details of the promotions process will not be communicated to teachers. Details are provided in the ATE information sheet.

Audits. To incentivise inspectors to undertake the ATE thoroughly and objectively, the scores submitted will be audited by individuals with an arm's-length relationship to the inspectors/schools. There are two auditor treatments: in the first, auditors are drawn from a pool of district officials from non-study districts, and in the second, auditors are drawn from a pool of secondary school teachers from study districts. Each inspector will have one of their 'surprise visits' assigned to the district official treatment, and another to the secondary school teacher treatment. Both types of auditor will review all of the information collected during the visits (photographic proof of attendance, video classroom observations) and will assign their own ATE score. Discrepancies between inspector and auditor scores will be noted. During training, inspectors will be told that their inspections will be audited and that their performance will be recognized and rewarded in two ways. First, inspectors who have performed well will be awarded a certificate at a public ceremony at the end of the school year. Second, inspector performance will be included in the dossier that forms part of their own Annual Confidential Report, and could therefore influence decisions about their own promotion and salary increments.

Marks assigned by the auditors will also be verified by the study team, as part of the process of evaluating the efficacy of the two different types of auditor. The auditor will be told this, but will not receive any other form of incentive.

SIR Intervention (80 schools).

The SIR intervention will be identical to the ATE intervention except that each inspector will collect information on the presence of the head teacher, all teachers, and all students, together with the pedagogy of two teachers.

Scoring rubric. These four performance measures will be scored to produce an overall SIR score. In each case, scoring is based on absolute performance against a pre-specified objective rubric.

• Head teacher attendance: present or excused absent = 8 points, unexcused absent = 0 points.
• Teacher attendance: sliding scale from 0 to 8 points for share of teachers present (max for >90% present).
• Student attendance: sliding scale from 0 to 8 points for share of students present (max for >90% present).
• Teacher pedagogy: sliding scale from 0 to 8 points for average share of students engaged in active teaching activities (max for >88% engaged) across the two classroom observations.

The detailed protocol for all four dimensions of this scoring rubric will be communicated to the inspector during training, and to the head teacher at the start of the first surprise visit via an information sheet.
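
The absolute-performance items in the rubric can be sketched in a few lines. The registration gives only the endpoints of each sliding scale, so the linear shape used here is an assumption, as are the function names; the actual protocol may use steps or a different curve, and it awards the maximum only above the threshold rather than at it.

```python
def attendance_points(present, excused=False):
    """Head teacher (or teacher) attendance: present or excused absent = 8
    points, unexcused absent = 0 points."""
    return 8 if (present or excused) else 0

def sliding_points(share, threshold):
    """Map a share in [0, 1] to 0-8 points, capped at 8.

    ASSUMPTION: points rise linearly from 0 and reach the maximum at the
    threshold (0.90 for attendance shares, 0.88 for pupil engagement).
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie between 0 and 1")
    return min(8.0, 8.0 * share / threshold)

# Hypothetical visit: head teacher present, 95% of teachers and 45% of
# students present, 60% of students engaged on average.
visit_points = [
    attendance_points(True),
    sliding_points(0.95, 0.90),
    sliding_points(0.45, 0.90),
    sliding_points(0.60, 0.88),
]
```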

Career incentives. The scores across these four performance measures will be aggregated into a single, cardinal score for each head teacher; specifically, the overall score will be the unweighted sum of 12 measures (head teacher attendance x3, teacher attendance x3, student attendance x3, teacher pedagogy x3) scored on the 0-8-point scale.

As in the ATE intervention, these cardinal scores will feed into the ordinal ranking exercise within circles. At the end of the school year, the schools within each circle x SIR treatment pair will be ranked on the basis of their cardinal SIR score. In the event of a tie, the inspector (who, by construction, visited all schools in the comparison set) will be asked to consider all four dimensions of performance and break the tie in favour of one school or the other. The head teacher of the top-ranked school will have his/her promotion fast-tracked by one year, while the head teacher of the bottom-ranked school will have his/her promotion delayed by one year. The head teachers of the two middle-ranked schools will experience no change.

Again, in an attempt to prevent dysfunctional behaviour (collusion and/or demotivation) among head teachers, the precise details of the promotions process will not be communicated to head teachers. Details are provided in the SIR information sheet.

Audits. Scores awarded as part of the SIR will also be audited, following an identical protocol to the ATE intervention.

'Business-as-usual' control (80 schools)

The same district inspector will be visiting schools under different treatments. This creates the possibility of spillovers from the ATE and SIR interventions into control schools. To mitigate this, at the start of inspector training (before any mention of the ATE and SIR interventions) inspectors will take part in facilitated group discussions about the protocols currently used for teacher 'annual confidential reports', and for school inspections, in their district. On the basis of these discussions, a written 'business-as-usual' protocol will be established for each district-gender. During training it will be stressed to each inspector that it is critically important that he/she adheres to this 'business-as-usual' protocol for all schools in the district that have not been included in either the ATE or SIR interventions.
Intervention Start Date
2017-11-07
Intervention End Date
2018-02-10
Primary Outcomes
Primary Outcomes (end points)
Our primary outcomes relate to students. The first is student learning, as measured by the difference in performance of students in Grade 4 math classes at endline compared to baseline on written, independently administered math tests. The second is student drop out, as measured by the difference in the size of the official class register (per grade) at endline versus baseline. The third is student attendance, as measured on the day (and day before) the baseline and endline surveys.
Primary Outcomes (explanation)
Student learning. Grade 4 math students in all 240 schools will sit a written math test at baseline in September 2017 and again at endline in March 2018. These test scores will be used to obtain two (item response theory) estimates of student learning, which will be used to test for treatment impacts in a (now standard) ANCOVA student-level specification.
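
One way to write such a specification is shown below. The registration does not give the estimating equation, so this is an illustrative form only: the circle fixed effects and the notation are assumptions, chosen because treatment was randomized within circles.

```latex
y^{\mathrm{end}}_{is} = \alpha
  + \beta_{1}\,\mathrm{ATE}_{s}
  + \beta_{2}\,\mathrm{SIR}_{s}
  + \gamma\, y^{\mathrm{base}}_{is}
  + \delta_{c(s)}
  + \varepsilon_{is}
```

Here y^end and y^base are the endline and baseline test-score estimates for student i in school s, ATE and SIR are school-level treatment indicators, and delta is a fixed effect for the circle c(s) containing school s. The coefficients beta_1 and beta_2 are the treatment effects relative to the control arm.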

Student drop out. We define drop outs as students who enrol at the start of the school year but either stop attending during the year and are withdrawn from the register or (except for those in the terminal grade) fail to enrol again the following school year. We will measure within-year drop outs via the baseline and endline surveys. Funding permitting, we will also obtain information on drop outs between years from a follow up survey at the start of the new school year in Fall 2018.

Student attendance. We measure student attendance on the day of the baseline survey as well as on the previous day. We again collect same day and previous day attendance at endline. These provide us with two measures of attendance.
Secondary Outcomes
Secondary Outcomes (end points)
Our first secondary outcome relates to teacher input, namely head teacher attendance and grade 4 math teacher attendance, as measured via the school’s official teacher register for the 5 days prior to the baseline and endline surveys. (This is a different measure of attendance from those collected during ‘surprise visits’ in ATE and SIR schools as part of the intervention.)

A further set of secondary outcomes relates to intrinsic motivation of teachers. Here, we have head teacher motivation and grade 4 math teacher motivation as measured via play in framed dictator games and head teacher motivation and grade 4 math teacher motivation as measured via the Perry Public Service scale. We will further examine which, if any, of these two measures of intrinsic motivation better predicts improvements in student learning and reductions in student drop out.

Another secondary outcome is head teacher and grade 4 math teacher drop out, as measured by a comparison of the official teacher register at baseline and endline.

Our final secondary outcome is performance of auditors, as measured by discrepancies in scoring inspections between the auditor and field staff employed by the PI team.
Secondary Outcomes (explanation)
Head teacher and grade 4 math teacher attendance. Head teacher attendance is recorded on the day of the (unannounced) baseline and endline surveys. During these surveys, enumerators will ask the head teacher for access to the official teacher register. From this they will record whether the teacher was present, absent but sanctioned, or absent but unsanctioned on each of the previous 5 school days. If there is no record, the enumerator will ask the head teacher to provide a response. Our measure of grade 4 math teacher attendance will average over these 5 days.

Head teacher and teacher motivation. Both head teachers and teachers will play a ‘lab-in-the-field’ dictator game at baseline and endline. Participants will be told that money contributed (rather than kept for personal use) will be used to buy resources for students at the school. These contributions are one measure of intrinsic motivation. At baseline and endline, head teachers and teachers will also be asked the full suite of Perry (1996) survey questions designed to elicit strength of Public Service Motivation. An index based on these responses will be our other measure of intrinsic motivation.

Head teacher and grade 4 math teacher drop out. As for students, we define drop outs as teachers who are listed at the start of the school year but either stop attending during the year and are withdrawn from the teacher register or fail to be listed the following school year. We will measure within-year drop outs via the baseline and endline surveys. Funding permitting, we will also obtain information on drop outs between years from a follow up survey at the start of the new school year in Fall 2018.

Performance of auditors. Inspectors will be told that independent, third parties will audit their inspections by watching and scoring the footage captured on tablets. Field staff will also watch and score this footage. We will use the discrepancies between auditor and field staff scores as our measure of auditor performance. The distribution of performance scores will then be compared across auditor treatments.
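
A minimal sketch of this comparison is given below. It assumes the discrepancy is summarized as a mean absolute difference per auditor treatment; the registration does not specify the summary statistic, and the record layout, labels, and scores are hypothetical.

```python
from statistics import mean

def auditor_discrepancies(records):
    """records: list of dicts with keys 'auditor_type', 'auditor_score' and
    'field_score'. Returns {auditor_type: mean absolute discrepancy}."""
    gaps_by_type = {}
    for record in records:
        # ASSUMPTION: discrepancy = absolute gap between the two scores.
        gap = abs(record["auditor_score"] - record["field_score"])
        gaps_by_type.setdefault(record["auditor_type"], []).append(gap)
    return {kind: mean(gaps) for kind, gaps in gaps_by_type.items()}

# Hypothetical audit records for the two auditor treatments.
records = [
    {"auditor_type": "district_official", "auditor_score": 20, "field_score": 22},
    {"auditor_type": "district_official", "auditor_score": 18, "field_score": 18},
    {"auditor_type": "secondary_teacher", "auditor_score": 24, "field_score": 19},
    {"auditor_type": "secondary_teacher", "auditor_score": 10, "field_score": 13},
]
summary = auditor_discrepancies(records)
```

A smaller value indicates auditor scores that sit closer to the field staff benchmark.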
Experimental Design
Experimental Design
This study will test whether (either of) two new school accountability systems can improve student learning outcomes in rural, public primary schools in KP province, Pakistan. The core of the experimental design is a school-level randomized control trial of either an Annual Teacher Evaluation (ATE), or a School Inspection Report (SIR), against a ‘business-as-usual’ control. These three systems are described in detail under the intervention section above. The trial will take place in 240 schools within 20 circles (school districts) during the 2017/18 school year. In each circle, 4 schools will be randomly assigned to each of three groups: ATE, SIR, or ‘business-as-usual’ control. Both the ATE and the SIR will be based on surprise visits by independent district inspectors, who will collect information on the presence of staff and students, pedagogy of teachers, and performance of students in math classes using electronic tablets. Inspectors will award scores based on this information, which will then determine teacher and head teacher promotions through tournaments within circles. To incentivise inspectors to do a thorough and objective job, the electronic records of this information will be audited, with the audit reports placed on file to determine the inspectors’ own promotions. A secondary feature of the experimental design is an inspector-level randomized control trial of alternative forms of auditing, either by officials from non-study districts, or by secondary school teachers within the same district.
Experimental Design Details
Randomization Method
Assignment of schools to treatment (ATE, SIR, control), and assignment of auditors to inspectors (official from non-study district or secondary school teacher), will be done in the office, by computer, by the PI team.

Other assignments required as part of the intervention (e.g. choice of classroom to visit for teacher pedagogy observations under the SIR treatment) will be done in the field, automatically by the tablet on which the inspection is being undertaken using data entered by the inspector upon arrival at the school.
Randomization Unit
Accountability treatment: the unit of randomization is a school.
Auditor treatment: the unit of randomization is an inspection visit.
Was the treatment clustered?
Yes
Experiment Characteristics
Sample size: planned number of clusters
Sampling of 240 schools. Following discussions with the E&SE Department, three districts (Charsadda, Mardan, and Nowshera) were identified as locations where it would be logistically feasible to operate. Enrolment data from Pakistan’s Independent Monitoring Unit identified 1,547 schools satisfying the following criteria:

• Rural, public, primary school
• Grade 4 enrolment between 10-50 inclusive (based on Fall 2016 enrolment)
• Grade 4 teacher neither up for promotion nor transfer within next two years, not the head teacher, and not having the Pakistan Reading Program (PRP) in place.

Dropping schools used in the same district as a pilot leaves 870 schools for sampling: 629 male and 241 female. These schools are organised into circles, which are administrative units that operate much like a school district; within each circle, there is a set of boys’ schools and a set of girls’ schools, each with separate E&SE department officials overseeing their administration. Accordingly, there are two circle-genders within each circle: one boys’ circle-gender and one girls’ circle-gender.

A total of 20 circle-gender pairs will be drawn from the 48 circle-gender pairs available. To be drawn, a circle-gender pair must contain at least 18 schools (to allow for replacements if needed). Given the relative scarcity of girls’ schools, this requirement will result in 5 girls’ circles and 15 boys’ circles. Within each circle-gender pair, 12 schools will then be drawn for inclusion in the study. 12 schools x 20 circle-gender pairs gives the 240 schools in the study.
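
The within-pair draw can be sketched as follows. The sketch covers only the step of drawing 12 study schools from an eligible circle-gender pair and holding the rest in reserve; how the 20 pairs themselves were chosen is not described in enough detail to reproduce. The school names, seed, and the idea of an ordered reserve list are assumptions.

```python
import random

MIN_SCHOOLS = 18    # eligibility threshold stated in the text
STUDY_SCHOOLS = 12  # schools drawn per circle-gender pair

def draw_schools(pair_schools, rng):
    """Split one pair's schools into 12 study schools and a reserve list."""
    if len(pair_schools) < MIN_SCHOOLS:
        raise ValueError("circle-gender pair has fewer than 18 schools")
    # Sampling all schools without replacement yields a random ordering.
    order = rng.sample(list(pair_schools), k=len(pair_schools))
    return order[:STUDY_SCHOOLS], order[STUDY_SCHOOLS:]

rng = random.Random(0)  # hypothetical seed
study, reserve = draw_schools([f"school_{i}" for i in range(21)], rng)
total_schools = STUDY_SCHOOLS * 20  # 12 schools x 20 pairs
```

With at least 18 schools per pair, each pair keeps six or more schools in reserve as potential replacements.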
Sample size: planned number of observations
6,240 students. 240 teachers. 240 head teachers. 40 inspectors.
Sample size (or number of clusters) by treatment arms
The ‘accountability’ treatment will be assigned randomly across the 12 schools within each of the 20 circle-genders: 4 schools in the ATE treatment, 4 in the SIR treatment, and 4 in the business-as-usual control. In total, there will therefore be 80 Annual Teacher Evaluation (ATE) schools, 80 School Inspection Report (SIR) schools, and 80 Business-as-usual Control schools.

There will also be an ‘auditing’ treatment. For each inspector, one visit will be randomly assigned to audit by an official from a non-study district and another visit to audit by a secondary school teacher from the same district.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Power calculations using math test scores from the Learning and Educational Achievement in Punjab Schools (LEAPS) study in Punjab, Pakistan show that we can detect an effect size of 0.125 standard deviations over baseline achievement with 80 schools in each of the two treatment groups and 80 schools in the control group.
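
For readers who want to see how a figure of this kind is typically obtained, the sketch below implements a textbook minimum-detectable-effect formula for school-level assignment, using a normal approximation. It is not the authors' calculation: the intra-cluster correlation, significance level, and power are illustrative assumptions, no baseline covariate adjustment is modelled, and the output is not expected to reproduce the 0.125 figure. The 26 students per school is simply 6,240 students divided by 240 schools.

```python
from math import sqrt
from statistics import NormalDist

def mde_sd(n_schools, students_per_school, icc,
           share_treated=0.5, alpha=0.05, power=0.80):
    """Minimum detectable effect, in standard deviations, for a two-arm
    comparison with treatment assigned at the school level."""
    normal = NormalDist()
    # Critical values for a two-sided test and the desired power.
    multiplier = normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)
    # Variance inflation from students being clustered within schools.
    design = icc + (1 - icc) / students_per_school
    variance = design / (share_treated * (1 - share_treated) * n_schools)
    return multiplier * sqrt(variance)

# One treatment arm against control: 80 + 80 = 160 schools.
# ASSUMPTION: icc=0.10 is an illustrative value, not taken from LEAPS.
illustrative_mde = mde_sd(n_schools=160, students_per_school=26, icc=0.10)
```

Conditioning on baseline test scores, as in an ANCOVA specification, would reduce residual variance and therefore the detectable effect relative to this unadjusted sketch.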
Supporting Documents and Materials

IRB
INSTITUTIONAL REVIEW BOARDS (IRBs)
IRB Name
International Food Policy Research Institute (IFPRI) Institutional Review Board
IRB Approval Date
2017-08-23
IRB Approval Number
00007490
Post-Trial
Post Trial Information
Study Withdrawal
Intervention
Is the intervention completed?
No
Is data collection complete?
Data Publication
Data Publication
Is public data available?
No
Program Files
Program Files
Reports and Papers
Preliminary Reports
Relevant Papers