Improving Accountability in Khyber Pakhtunkhwa's Schools

Last registered on March 30, 2018

Pre-Trial

Trial Information

General Information

Title

Improving Accountability in Khyber Pakhtunkhwa's Schools

RCT ID

AEARCTR-0002815

Initial registration date

March 30, 2018

First published

March 30, 2018, 5:35 PM EDT

Locations

Country

Pakistan

Region

Khyber Pakhtunkhwa

Primary Investigator

Name

Clare Leaver

Affiliation

University of Oxford

Other Primary Investigator(s)

PI Name

Katrina Kosec

PI Affiliation

International Food Policy Research Institute (IFPRI)

PI Name

Naureen Karachiwalla

PI Affiliation

International Food Policy Research Institute (IFPRI)

PI Name

Saher Asad

PI Affiliation

Lahore University of Management Sciences

PI Name

Masooma Habib

PI Affiliation

Consortium for Development Policy Research

Additional Trial Information

Status

On going

Start date

2017-09-18

End date

2018-09-30

Keywords

Education, Governance, Labor

Additional Keywords

JEL code(s)

Secondary IDs

Abstract

Pakistan has low student learning levels and educator motivation, and accountability is a pervasive problem. The Khyber Pakhtunkhwa (KP) Elementary and Secondary Education (E&SE) Department wants to improve learning by increasing teacher accountability through the reform of two dysfunctional accountability systems. The first is the system of ‘annual confidential reports’ for teachers. These evaluations were conducted at the end of the calendar year (overlapping two school years), performed by head teachers who find it difficult to criticize colleagues, and did not include teaching-specific measures known to improve learning. Moreover, although in theory evaluation scores should be tied to promotion, in practice promotions have been based on seniority, educational qualifications, and/or political connections. The second source of dysfunction is the system of school inspections. Head teachers face school inspections, but they are irregular, unstructured, and results are often unreported. Thus, neither system currently motivates effort.

During the 2017/8 school year, the research team, alongside the KP E&SE Department, is introducing two new accountability systems with the aim of motivating educator effort and improving student learning outcomes. The first is the Annual Teacher Evaluation (ATE), which aims to address previous shortcomings in a number of ways. It covers the school rather than calendar year, and so a single cohort of students. It focuses on teaching-specific outcomes—presence of the teacher and his/her students, the teacher’s pedagogy, and test scores of the teacher’s students. It is conducted by relatively more independent, district-level inspectors rather than colleagues from the same school. Finally, and crucially, teachers’ ATE scores are explicitly linked to career progression via promotion tournaments within districts; teachers who perform well relative to peers in similar schools are fast-tracked for promotion, while those who perform poorly are held back.

The second new accountability system is the School Inspection Report (SIR). This school-level report is also undertaken by independent district inspectors and, in contrast to the previous system, is based on regular, structured site visits. Similar information is collected as for ATEs—presence of the head teacher, staff, and students within the school, and teacher pedagogy. The main difference compared to the ATE is that it is the head teacher’s career progression that is on the line; head teachers of schools that perform favourably compared to other schools in their circle (the equivalent of school district) are fast-tracked for promotion, while head teachers of schools that perform relatively poorly are held back.

The study team is collaborating with the KP E&SE Department to undertake a randomized controlled trial of these two new accountability systems. The trial is funded by the International Growth Centre (IGC), the Lahore University of Management Sciences (LUMS), the United States Agency for International Development (USAID), and the University of Oxford, and involves 240 rural, public primary schools in three districts in KP: Charsadda, Mardan, and Nowshera. There are three study arms, each with 80 schools: new ATE (teacher incentives), new SIR (head teacher incentives) and a ‘business-as-usual’ control. Both the ATE and SIR will be tied to promotion tournaments within circles (the equivalent of school districts). Twelve schools in each circle will participate in the study, 4 in each study arm. The study will provide the KP E&SE Department with an evidence base to assess whether strengthening of career incentives is an effective policy to motivate teacher effort and improve student learning outcomes.

External Link(s)

Registration Citation

Citation

Asad, Saher et al. 2018. "Improving Accountability in Khyber Pakhtunkhwa's Schools." AEA RCT Registry. March 30. https://doi.org/10.1257/rct.2815-1.0

Former Citation

Asad, Saher et al. 2018. "Improving Accountability in Khyber Pakhtunkhwa's Schools." AEA RCT Registry. March 30. https://www.socialscienceregistry.org/trials/2815/history/27504

Sponsors & Partners

Partner

Name

Khyber Pakhtunkhwa (KP) Elementary and Secondary Education Department

Type

government

URL

There is information in this trial unavailable to the public.

Experimental Details

Interventions

Intervention(s)

The study sample of 240 schools consists of 20 circles (5 girls' circles and 15 boys' circles) with 12 schools in each circle, across three districts. Treatment was randomly assigned within circles. In each circle, there are 4 schools in the ATE treatment, 4 in the SIR treatment, and 4 in the 'business-as-usual' control.
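The within-circle assignment described above can be illustrated with a minimal sketch. It is not the research team's actual randomization code: the school identifiers, the seed, and the function names are hypothetical, and the only features taken from the registration are the 20 circles, the 12 schools per circle, and the 4-4-4 split across arms.

```python
import random

ARMS = ("ATE", "SIR", "control")

def assign_within_circle(school_ids, rng):
    """Randomly split one circle's 12 schools into three arms of 4 schools."""
    if len(school_ids) != 12:
        raise ValueError("expected exactly 12 schools per circle")
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    # Positions 0-3 go to ATE, 4-7 to SIR, 8-11 to control.
    return {school: ARMS[i // 4] for i, school in enumerate(shuffled)}

def assign_all(circles, seed):
    """Apply the within-circle split to every circle using one seeded generator."""
    rng = random.Random(seed)
    assignment = {}
    for circle_id in sorted(circles):
        assignment.update(assign_within_circle(circles[circle_id], rng))
    return assignment

# Hypothetical sample frame: 20 circles, 12 placeholder school IDs in each.
circles = {c: [f"c{c:02d}_s{s:02d}" for s in range(12)] for c in range(20)}
assignment = assign_all(circles, seed=2017)
```

Because the shuffle is done separately inside each circle, every circle contributes exactly 4 schools to each arm, which gives 80 schools per arm overall.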

ATE Intervention (80 schools)

All schools will receive three 'surprise' visits from an independent district inspector during the 2017/8 school year. The matching of inspectors to schools, and the dates of the site visits will be determined randomly by researchers. Inspectors will be provided with information on the school the evening before the visit, and will be instructed not to notify the school. If the inspector cannot visit that school that day, he/she will need to provide a reason and, if this is acceptable, he/she will be reassigned an alternative, randomly selected school the following day. All inspectors will be provided with: the ATE protocol; a tablet (pre-loaded with a relevant computer assisted personal interview (CAPI) program with which the ATE will be conducted, including teacher/student rosters) and tripod stand; a paper teacher pedagogy assessment tool; and a Wi-Fi box with credit to upload data. Upon arrival at the start of the school day, the inspector will explain the new ATE system to the head teacher and find out when the Grade 4 math class will be taught. Each inspector will then collect information on four performance measures: the presence and pedagogy of the Grade 4 math teacher, together with the presence of and learning outcomes for his/her students in the Grade 4 math class.

Scoring rubric. These four performance measures will be scored to produce an overall ATE score. For the first three, scoring is based on absolute performance against a pre-specified objective rubric.

• Teacher attendance: present or excused absent = 8 points, unexcused absent = 0 points.
• Student attendance: sliding scale from 0 to 8 points for share of students present (max for >90% present).
• Teacher pedagogy: sliding scale from 0 to 8 points for average share of students engaged in active teaching activities (max for >88% engaged).

For student learning, scoring is based on relative performance and is intended to capture the main features of the 'pay-for-percentile' approach (Barlevy and Neal 2012). Specifically, once all data have been collected from all 80 ATE study schools, Grade 4 math teachers will be put into 'bins' of 8 based on their start of year (first visit) 'percentage of 50 Grade 4 math questions answered correctly' score. At the end of the year (third visit), these teachers will be ranked within bins. The Grade 4 math teacher in the school with the top rank will receive 8 marks, the next rank 7 marks, down to zero for the Grade 4 math teacher in the school with the bottom rank. This scoring of student learning will be done by the research team.
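A rough sketch of the bin-and-rank step may help fix ideas. It is an illustration under stated assumptions, not the research team's scoring code. Two points are assumed rather than taken from the text: ties are left in their sorted order, and marks fall by one per rank from a top mark of 8. With eight teachers per bin, that second assumption puts the bottom rank at 1 rather than the zero mentioned above, since eight ranks cannot span nine mark values. The function name and the example data are hypothetical.

```python
def percentile_marks(teachers, bin_size=8, top_mark=8):
    """Assign learning marks by ranking teachers within baseline-score bins.

    teachers: list of (teacher_id, baseline_pct, endline_pct) tuples.
    ASSUMPTION: marks fall by one per rank, floored at zero.
    """
    # Group teachers with similar start-of-year scores into bins.
    by_baseline = sorted(teachers, key=lambda t: t[1], reverse=True)
    marks = {}
    for start in range(0, len(by_baseline), bin_size):
        bin_members = by_baseline[start:start + bin_size]
        # Within a bin, rank on the end-of-year score.
        ranked = sorted(bin_members, key=lambda t: t[2], reverse=True)
        for rank, (teacher_id, _, _) in enumerate(ranked):
            marks[teacher_id] = max(top_mark - rank, 0)
    return marks

# Hypothetical data: 16 teachers whose endline order reverses their baseline order.
teachers = [(f"T{i}", 10 + 2 * i, 90 - (10 + 2 * i)) for i in range(16)]
marks = percentile_marks(teachers)
```

In this made-up example the two bins are T8 to T15 and T0 to T7, and the teacher with the highest endline score in each bin receives the top mark.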

The detailed protocol for all four dimensions of this scoring rubric will be communicated to inspectors during an October 2017 training (following a September 2017 baseline survey, but before treatment begins), and to the Grade 4 math teacher at the start of the first surprise visit, via a letter outlining the intervention and the scoring.

Career incentives. The scores across these four performance measures will be aggregated into a single, cardinal score for each Grade 4 math teacher. Specifically, the overall score will be the sum of 12 measures (teacher attendance x3, student attendance x3, teacher pedagogy x3, student performance x3), each scored on the 0-8-point scale. (While there will be a single measure of student performance, it will be included three times in the score calculation to ensure that the four categories are weighted equally, as teachers were led to expect in the information sheet.) These cardinal scores will feed into an ordinal ranking exercise. Within each of the 20 circles in the study, there are 4 schools randomly assigned to the ATE treatment. At the end of the school year, the four Grade 4 math teachers within each circle x ATE treatment pair will be ranked on the basis of their cardinal ATE score. In the event of a tie, the inspector (who will have visited all schools in the comparison set) will be asked to consider all four dimensions of performance and break the tie in favour of one teacher or the other. The top-ranked teacher will have his/her promotion fast-tracked by one year, while the bottom-ranked teacher will have his/her promotion delayed by one year. The two teachers in the middle of the ranking will experience no change.
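The aggregation and within-circle tournament can be sketched as follows. This is a minimal illustration with hypothetical names and made-up scores. It assumes the three attendance and pedagogy scores come from the three surprise visits, and it does not model the inspector's tie-break, so it stops when a tie would affect the top or bottom rank.

```python
def overall_score(visit_scores, learning_mark):
    """Sum the visit-level scores and count the learning mark three times.

    visit_scores: three (teacher_attendance, student_attendance, pedagogy)
    tuples, each entry on the 0-8 scale. The maximum total is 96.
    """
    if len(visit_scores) != 3:
        raise ValueError("expected scores from three visits")
    return sum(sum(visit) for visit in visit_scores) + 3 * learning_mark

def promotion_outcomes(scores):
    """Rank the teachers in one circle and map ranks to promotion outcomes."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    values = [scores[t] for t in ranked]
    if values[0] == values[1] or values[-1] == values[-2]:
        # The registration leaves tie-breaking to the inspector's judgement.
        raise ValueError("tie at the top or bottom: needs the inspector's tie-break")
    outcomes = {teacher: "no change" for teacher in ranked}
    outcomes[ranked[0]] = "fast-tracked by one year"
    outcomes[ranked[-1]] = "delayed by one year"
    return outcomes

# Hypothetical scores for the four ATE teachers in one circle.
circle_scores = {
    "A": overall_score([(8, 6, 5), (8, 7, 6), (8, 8, 6)], learning_mark=7),
    "B": overall_score([(8, 4, 4), (0, 5, 3), (8, 4, 4)], learning_mark=2),
    "C": overall_score([(8, 6, 6)] * 3, learning_mark=4),
    "D": overall_score([(8, 5, 5)] * 3, learning_mark=5),
}
outcomes = promotion_outcomes(circle_scores)
```

Counting the single learning mark three times gives each of the four categories the same maximum of 24 points.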

In an attempt to prevent dysfunctional behaviour (collusion and/or demotivation) among teachers, the precise details of the promotions process will not be communicated to teachers. Details are provided in the ATE information sheet.

Audits. To incentivise inspectors to undertake the ATE thoroughly and objectively, the scores submitted will be audited by individuals with an arm's-length relationship to the inspectors/schools. There are two auditor treatments: in the first, auditors are drawn from a pool of district officials from non-study districts, and in the second, auditors are drawn from a pool of secondary school teachers from study districts. Each inspector will have one of their 'surprise visits' assigned to the district official treatment, and another to the secondary school teacher treatment. Both types of auditor will review all of the information collected during the visits (photographic proof of attendance, video classroom observations) and will assign their own ATE score. Discrepancies between inspector and auditor scores will be noted. During training, inspectors will be told that their inspections will be audited and that their performance will be recognized and rewarded in two ways. First, inspectors who have performed well will be awarded a certificate at a public ceremony at the end of the school year. Second, inspector performance will be included in the dossier that forms part of their own Annual Confidential Report, and could therefore influence decisions about their own promotion and salary increments.

Marks assigned by the auditors will also be verified by the study team, as part of the process of evaluating the efficacy of the two different types of auditor. The auditor will be told this, but will not receive any other form of incentive.

SIR Intervention (80 schools).

The SIR intervention will be identical to the ATE intervention except that each inspector will collect information on the presence of the head teacher, all teachers, and all students, together with the pedagogy of two teachers.

Scoring rubric. These four performance measures will be scored to produce an overall SIR score. In each case, scoring is based on absolute performance against a pre-specified objective rubric.

• Head teacher attendance: present or excused absent = 8 points, unexcused absent = 0 points.
• Teacher attendance: sliding scale from 0 to 8 points for share of teachers present (max for >90% present).
• Student attendance: sliding scale from 0 to 8 points for share of students present (max for >90% present).
• Teacher pedagogy: sliding scale from 0 to 8 points for average share of students engaged in active teaching activities (max for >88% engaged) across the two classroom observations.

The detailed protocol for all four dimensions of this scoring rubric will be communicated to the inspector during training, and to the head teacher at the start of the first surprise visit via an information sheet.

Career incentives. The scores across these four performance measures will be aggregated into a single, cardinal score for each head teacher; specifically, the overall score will be the unweighted sum of 12 measures (head teacher attendance x3, teacher attendance x3, student attendance x3, teacher pedagogy x3) scored on the 0-8-point scale.

As in the ATE intervention, these cardinal scores will feed into the ordinal ranking exercise within circles. At the end of the school year, the schools within each circle x SIR treatment pair will be ranked on the basis of their cardinal SIR score. In the event of a tie, the inspector (who, by construction, visited all schools in the comparison set) will be asked to consider all four dimensions of performance and break the tie in favour of one school or the other. The head teacher of the top-ranked school will have his/her promotion fast-tracked by one year, while the head teacher of the bottom-ranked school will have his/her promotion delayed by one year. The head teachers of the two middle-ranked schools will experience no change.

Again, in an attempt to prevent dysfunctional behaviour (collusion and/or demotivation) among head teachers, the precise details of the promotions process will not be communicated to head teachers. Details are provided in the SIR information sheet.

Audits. Scores awarded as part of the SIR will also be audited, following an identical protocol to the ATE intervention.

'Business-as-usual' control (80 schools)

The same district inspector will visit schools under different treatments. This creates the possibility of spillovers from the ATE and SIR interventions into control schools. To mitigate this, at the start of inspector training (before any mention of the ATE and SIR interventions) inspectors will take part in facilitated group discussions about the protocols currently used for teacher 'annual confidential reports', and for school inspections, in their district. On the basis of these discussions, a written 'business-as-usual' protocol will be established for each district-gender. During training it will be stressed to each inspector that it is critically important that he/she adheres to this 'business-as-usual' protocol for all schools in the district that have not been included in either the ATE or SIR interventions.

Intervention (Hidden)

Protocol for ATE performance measures

Grade 4 math teacher attendance. The inspector will attempt to locate the Grade 4 math teacher within the school. If this teacher can be found within 30 minutes of the school's start time, the inspector will take a photo of him/her holding his/her photo ID and record 'present'. If the teacher is absent but this has been excused by the head teacher, the inspector will ask for supporting documentation (e.g. a letter from a district official, a letter from a doctor, or a text message already sent to the head teacher etc.) and take a photo of it and record 'absent excused'. If there is no documentation, the inspector will record 'absent unexcused'.

Grade 4 math class student attendance. The inspector will record student attendance using the roster pre-loaded on the tablet and take a clear photo of the entire class. For each student, the options are: 'present', 'absent but attended this term', and 'absent, has not attended this term'.

Grade 4 math class teacher pedagogy. At the start of the Grade 4 math class, the inspector will set up the tablet on the tripod stand at a high vantage point at the back of the classroom, ensuring that the teacher, blackboard, and all students can be seen clearly. The tablet will then record the entire class. Having set up the recording, the inspector will then complete the teacher pedagogy assessment tool on pen and paper. The tool asks inspectors to record what is happening (from 4 options) during the first ten seconds of each five-minute interval throughout the first 30 minutes of the lesson. These options are: A) the teacher is in class and teaching/doing a learning activity with the students; B) the teacher is in class but is not doing a teaching/learning activity; C) the teacher is not in class but the students are in class; D) neither the teacher nor the students are in class. If Option A is chosen, the inspector will be prompted to choose up to three teaching activities from a pre-specified list. For each of these activities the inspector will record the number of students who are actively engaged in that activity. After the class is over, the inspector will stop the recording and leave the class. At this point, the inspector will immediately enter the information from the paper pedagogy assessment tool into the CAPI program on the tablet. The tablet will then automatically produce a teacher pedagogy score. Specifically, it will separate active from non-active teaching activities (a classification that is not apparent to the inspector or teacher). Non-active teaching activities will be scored as zero. Active teaching activities will be scored as the fraction of students engaged. The tablet will compute the average share of students engaged across all active activities and all of the five-minute intervals during which observations were taken.
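One possible reading of this calculation is sketched below. The registration does not spell out the averaging order, so the sketch makes two assumptions: activities are averaged within each interval first and intervals are then averaged, and intervals coded B, C, or D count as zero. The function names and the example observations are hypothetical, and the tablet's actual CAPI logic may differ.

```python
def interval_score(option, activities, class_size):
    """Share of students engaged during one observed interval.

    option: one of "A", "B", "C", "D" from the assessment tool.
    activities: up to three (is_active, n_engaged) pairs, used only for option A.
    ASSUMPTION: intervals coded B, C or D score zero.
    """
    if option != "A" or not activities:
        return 0.0
    shares = [
        (n_engaged / class_size) if is_active else 0.0  # non-active scores zero
        for is_active, n_engaged in activities
    ]
    return sum(shares) / len(shares)

def pedagogy_share(observations, class_size):
    """Average the interval scores across all observed intervals."""
    scores = [interval_score(opt, acts, class_size) for opt, acts in observations]
    return sum(scores) / len(scores)

# Hypothetical lesson with 20 students and six observed intervals.
observations = [
    ("A", [(True, 20), (True, 10)]),   # two active activities
    ("A", [(False, 20)]),              # one non-active activity
    ("B", []),                         # teacher present but not teaching
    ("A", [(True, 15)]),
    ("A", [(True, 10), (False, 20)]),
    ("C", []),                         # teacher absent from class
]
share = pedagogy_share(observations, class_size=20)
```

The resulting share is what the rubric's sliding scale would then convert into points on the 0-8 scale.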

Grade 4 math class student learning. At a time when the Grade 4 math class is not in session, the inspector will arrange the students from this class in alphabetical order. The inspector will set up the tablet on the kickstand to record footage of the students and will then ask one math question to one student at a time, reading aloud from prompts on the tablet. Each student will be given a pre-specified time to work out an answer using pen and paper (if needed) and will then be asked to answer the question aloud. The tablet will record the video in the background so that the inspector does not need to switch between the video and question screens between questions. Fifty questions will be asked in total, the first to the first student in alphabetical order and so on, with some students answering more than one question in the case of fewer than 50 students in the class. The tablet will then automatically produce a student performance score, based on the percentage of the 50 questions answered correctly.
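The question rotation can be pictured with a short sketch. It assumes that "and so on" means cycling back to the first student once the end of the alphabetical list is reached. The function names and the student labels are hypothetical, and Python's default string sort stands in for alphabetical order.

```python
def assign_questions(students, n_questions=50):
    """Return the student who answers each question, cycling alphabetically."""
    ordered = sorted(students)
    # ASSUMPTION: after the last student, the rotation restarts at the first.
    return [ordered[q % len(ordered)] for q in range(n_questions)]

def performance_pct(correct_flags):
    """Percentage of the questions that were answered correctly."""
    return 100.0 * sum(correct_flags) / len(correct_flags)

# Hypothetical class of 20 students.
students = [f"Student{n:02d}" for n in range(1, 21)]
schedule = assign_questions(students)
score = performance_pct([True] * 30 + [False] * 20)
```

With 20 students and 50 questions, the first ten students in the list answer three questions each and the remaining ten answer two each.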

Protocol for SIR performance measures

Head teacher attendance. The inspector will attempt to locate the head teacher within the school. If the head teacher can be found within 30 minutes of the school's start time, the inspector will take a photo of him/her holding his/her photo ID and record 'present'. If the head teacher is absent but the deputy says he/she has an excused absence, the inspector will ask for supporting documentation (e.g. a letter from a district official, a letter from a doctor etc.) and take a photo of it and record 'absent excused'. If there is no acceptable documentation, the inspector will record 'absent unexcused'.

Teacher attendance. The inspector will ask the head teacher to help them locate (within 30 minutes of the start time of the school) all teachers in the school, together with details of any excused absences. For those present, the inspector will take a photo with ID. Scoring of teacher attendance will be as in the ATE intervention.

Student attendance. At a time when all classes are in session, the inspector will visit each classroom in turn to record student attendance using the number of enrolled children in that class pre-loaded on the tablet (provided from the research team's September 2017 baseline survey) and take a clear photo of the entire class. Scoring of student attendance will be as in the ATE intervention.

Teacher pedagogy. The inspector will be asked to list all of the classes (grade and subject) that will be taught during the next teaching slot. The tablet will then randomly select one class to be observed. The inspector will then follow the teacher pedagogy assessment protocol described in the ATE intervention. Once this observation is complete, the inspector will repeat this process to obtain a second teacher pedagogy score from a different class.

Intervention Start Date

2017-11-07

Intervention End Date

2018-02-10

Primary Outcomes

Primary Outcomes (end points)

Our primary outcomes relate to students. The first is student learning, as measured by the difference in performance of students in Grade 4 math classes at endline compared to baseline on written, independently administered math tests. The second is student drop out, as measured by the difference in the size of the official class register (per grade) at endline versus baseline. The third is student attendance, as measured on the day of (and the day before) the baseline and endline surveys.

Primary Outcomes (explanation)

Student learning. Grade 4 math students in all 240 schools will sit a written math test at baseline in September 2017 and again at endline in March 2018. These test scores will be used to obtain two (item response theory) estimates of student learning, which will be used to test for treatment impacts in a (now standard) ANCOVA student-level specification.
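The registration does not write out the ANCOVA specification. One common form, given here purely as an assumed illustration, regresses the endline score on treatment indicators and the baseline score. The circle fixed effects are an assumption motivated by the within-circle randomization, and none of the notation comes from the registration.

```latex
y^{\text{end}}_{is} = \alpha + \beta_{1}\,\mathrm{ATE}_{s} + \beta_{2}\,\mathrm{SIR}_{s}
  + \gamma\, y^{\text{base}}_{is} + \delta_{c(s)} + \varepsilon_{is}
```

Here $y^{\text{end}}_{is}$ and $y^{\text{base}}_{is}$ are the endline and baseline test-score estimates for student $i$ in school $s$, $\mathrm{ATE}_{s}$ and $\mathrm{SIR}_{s}$ indicate the school's treatment arm, $\delta_{c(s)}$ is a fixed effect for the circle containing school $s$, and $\beta_{1}$ and $\beta_{2}$ are the treatment effects relative to the control arm. Since treatment is assigned at the school level, standard errors would typically be clustered by school.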

Student drop out. We define drop outs as students who enrol at the start of the school year but either stop attending during the year and are withdrawn from the register or (except for those in the terminal grade) fail to enrol again the following school year. We will measure within-year drop outs via the baseline and endline surveys. Funding permitting, we will also obtain information on drop outs between years from a follow up survey at the start of the new school year in Fall 2018.

Student attendance. We measure student attendance on the day of the baseline survey as well as on the previous day. We again collect same day and previous day attendance at endline. These provide us with two measures of attendance.

Secondary Outcomes

Secondary Outcomes (end points)

Our first secondary outcome relates to teacher input, namely head teacher attendance and grade 4 math teacher attendance, as measured via the school’s official teacher register for the 5 days prior to the baseline and endline surveys. (This is a different measure of attendance from those collected during ‘surprise visits’ in ATE and SIR schools as part of the intervention.)

A further set of secondary outcomes relates to intrinsic motivation of teachers. Here, we have head teacher motivation and grade 4 math teacher motivation as measured via play in framed dictator games and head teacher motivation and grade 4 math teacher motivation as measured via the Perry Public Service scale. We will further examine which, if any, of these two measures of intrinsic motivation better predicts improvements in student learning and reductions in student drop out.

Another secondary outcome is head teacher and grade 4 math teacher drop out, as measured by a comparison of the official teacher register at baseline and endline.

Our final secondary outcome is performance of auditors, as measured by discrepancies in scoring inspections between the auditor and field staff employed by the PI team.

Secondary Outcomes (explanation)

Head teacher and grade 4 math teacher attendance. Head teacher attendance is recorded on the day of the (unannounced) baseline and endline surveys. During these surveys, enumerators will ask the head teacher for access to the official teacher register. From this they will record whether the teacher was present, absent but sanctioned, or absent but unsanctioned on each of the previous 5 school days. If there is no record, the enumerator will ask the head teacher to provide a response. Our measure of grade 4 math teacher attendance will average over these 5 days.

Head teacher and teacher motivation. Both head teachers and teachers will play a ‘lab-in-the-field’ dictator game at baseline and endline. Participants will be told that money contributed (rather than kept for personal use) will be used to buy resources for students at the school. These contributions are one measure of intrinsic motivation. At baseline and endline, head teachers and teachers will also be asked the full suite of Perry (1996) survey questions designed to elicit strength of Public Service Motivation. An index based on these responses will be our other measure of intrinsic motivation.

Head teacher and grade 4 math teacher drop out. As for students, we define drop outs as teachers who are listed at the start of the school year but either stop attending during the year and are withdrawn from the teacher register or fail to be listed the following school year. We will measure within-year drop outs via the baseline and endline surveys. Funding permitting, we will also obtain information on drop outs between years from a follow up survey at the start of the new school year in Fall 2018.

Performance of auditors. Inspectors will be told that independent third parties will audit their inspections by watching and scoring the footage captured on tablets. Field staff will also watch and score this footage. We will use the discrepancies between auditor and field staff scores as our measure of auditor performance. The distribution of performance scores will then be compared across auditor treatments.

Experimental Design

This study will test whether (either of) two new school accountability systems can improve student learning outcomes in rural, public primary schools in KP province, Pakistan. The core of the experimental design is a school-level randomized control trial of either an Annual Teacher Evaluation (ATE), or a School Inspection Report (SIR), against a ‘business-as-usual’ control. These three systems are described in detail under the intervention section above. The trial will take place in 240 schools within 20 circles (school districts) during the 2017/18 school year. In each circle, 4 schools will be randomly assigned to each of three groups: ATE, SIR, or ‘business-as-usual’ control. Both the ATE and the SIR will be based on surprise visits by independent district inspectors, who will collect information on the presence of staff and students, pedagogy of teachers, and performance of students in math classes using electronic tablets. Inspectors will award scores based on this information, which will then determine teacher and head teacher promotions through tournaments within circles. To incentivise inspectors to do a thorough and objective job, the electronic records of this information will be audited, with the audit reports placed on file to determine the inspectors’ own promotions. A secondary feature of the experimental design is an inspector-level randomized control trial of alternative forms of auditing, either by officials from non-study districts, or by secondary school teachers within the same district.

Experimental Design Details

Randomization Method

Assignment of schools to treatment (ATE, SIR, control), and assignment of auditors to inspectors (official from non-study district or secondary school teacher), will be done in the office, by computer, by the PI team.

Other assignments required as part of the intervention (e.g. choice of classroom to visit for teacher pedagogy observations under the SIR treatment) will be done in the field, automatically by the tablet on which the inspection is being undertaken using data entered by the inspector upon arrival at the school.

Randomization Unit

Accountability treatment: the unit of randomization is a school.
Auditor treatment: the unit of randomisation is an inspection visit.

Was the treatment clustered?

Yes

Experiment Characteristics

Sample size: planned number of clusters

Sampling of 240 schools. Following discussions with the E&SE Department, three districts (Charsadda, Mardan, and Nowshera) were identified as locations where it would be logistically feasible to operate. Enrolment data from Pakistan’s Independent Monitoring Unit identified 1,547 schools satisfying the following criteria:

• Rural, public, primary school
• Grade 4 enrolment between 10-50 inclusive (based on Fall 2016 enrolment)
• Grade 4 teacher neither up for promotion nor transfer within next two years, not the head teacher, and not having the Pakistan Reading Program (PRP) in place.

Dropping schools that were used in a pilot in the same district leaves 870 schools for sampling: 629 male and 241 female. These schools are organised into circles, which are administrative units that operate much like a school district; within each circle, there is a set of boys’ schools and a set of girls’ schools, each with separate E&SE department officials overseeing their administration. Accordingly, there are two circle-genders within each circle: one boys’ circle-gender and one girls’ circle-gender.

A total of 20 circle-gender pairs will be drawn from the 48 circle-gender pairs available. To be drawn, a circle-gender pair must contain at least 18 schools (to allow for replacements if needed). Given the relative scarcity of girls’ schools, this requirement will result in 5 girls’ circles and 15 boys’ circles. Within each circle-gender pair, 12 schools will then be drawn for inclusion in the study. 12 schools x 20 clusters gives the 240 schools in the study.

Sample size: planned number of observations

6,240 students. 240 teachers. 240 headteachers. 40 inspectors.

Sample size (or number of clusters) by treatment arms

The ‘accountability’ treatment will be assigned randomly across the 12 schools within each of the 20 circle-genders: 4 schools in the ATE treatment, 4 in the SIR treatment, and 4 in the business-as-usual control. In total, there will therefore be 80 Annual Teacher Evaluation (ATE) schools, 80 School Inspection Report (SIR) schools, and 80 Business-as-usual Control schools.

There will also be an ‘auditing’ treatment. For each inspector, one visit will be randomly assigned to audit by an official from a non-study district and another visit to audit by a secondary school teacher from the same district.

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

Power calculations using math test scores from the Learning and Educational Achievement in Punjab Schools (LEAPS) study in Punjab, Pakistan show that we can detect an effect size of 0.125 standard deviations over baseline achievement with 80 schools in each of the two treatment groups and 80 schools in the control group.
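The inputs to this calculation are not reported here. For orientation only, the textbook expression for the minimum detectable effect in a school-randomised design without covariate adjustment is shown below. It is a generic formula, not the authors' calculation, and the intra-school correlation $\rho$, the outcome standard deviation $\sigma$, the significance level $\alpha$, and the power $1-\kappa$ are unspecified assumptions.

```latex
\mathrm{MDE} = \left(t_{1-\kappa} + t_{\alpha/2}\right)
  \sqrt{\frac{1}{P(1-P)\,J}}\;
  \sqrt{\rho + \frac{1-\rho}{n}}\;\sigma
```

For a comparison of one treatment arm with the control arm, $J = 160$ schools and $P = 0.5$. The planned 6,240 students across 240 schools imply $n = 26$ students per school. Conditioning on baseline scores, as in an ANCOVA specification, would reduce the residual variance and hence the MDE.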

Supporting Documents and Materials

There is information in this trial unavailable to the public.

IRB