Primary Outcomes (end points)
Child Measures:
A comprehensive battery of child development measures was used to assess language, fine motor, executive function (attention, inhibition, working memory), problem solving, social/emotional and numeracy/math skills. These measures cover abilities that typically begin to emerge and progress early in life; are encouraged through commonly recommended preschool practices; and are believed to be important for primary school success (Copple & Bredekamp, 2009; Duncan et al., 2007; Sabol & Pianta, 2012). As described below, all selected assessments had demonstrated reliability and/or validity in either Malawi or other sub-Saharan countries. Each test was translated and adapted as necessary for use in the present study. At the 36-month follow-up (Round 3), some scales were dropped because they no longer showed good variability in performance (i.e., were too easy), while other tests indicative of expanding capacities were added. For all analyses, scores were standardized to have a mean equal to 0 and standard deviation equal to 1 in the control group for ease of interpretation.
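As an illustration of the standardization described above, the following minimal sketch rescales raw scores so that the control group has mean 0 and standard deviation 1. The function name, the example data, and the use of the sample (n - 1) standard deviation are assumptions made for illustration; the study's exact procedure is not specified here.

```python
from statistics import mean, stdev

def standardize_to_control(scores, is_control):
    """Rescale all scores using the control group's mean and SD.

    Hypothetical helper: `scores` are raw test scores and `is_control`
    flags which children belong to the control group.
    """
    control = [s for s, c in zip(scores, is_control) if c]
    mu = mean(control)
    sd = stdev(control)  # sample SD (n - 1); an assumption, not from the study
    return [(s - mu) / sd for s in scores]

# Made-up raw scores; the first three children are in the control group.
raw = [10.0, 12.0, 14.0, 16.0, 18.0]
control_flags = [True, True, True, False, False]
z = standardize_to_control(raw, control_flags)
```

With this scaling, a treatment-group child's score can be read directly as the number of control-group standard deviations above or below the control-group mean.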
The battery included the following measures (see Appendix Table 1 for schedule of administration across rounds of data collection):
1. Malawi Developmental Assessment Tool (MDAT) (Gladstone et al., 2008), a test created and validated specifically for use in rural Malawi with children 0-7 years of age. Subscales for assessing language and fine motor/perception skills were administered. Items were scored as pass or fail, and a summed score was calculated overall and for each subscale.
2. Peabody Picture Vocabulary Test - IV (PPVT-IV) (Dunn, 1965), a test of receptive vocabulary that measures comprehension of words through picture identification. The PPVT has been widely used throughout the world for assessing the effects of various interventions on child language, including in Mozambique (using the TVIP, the Spanish version of the PPVT, administered in the local language; Martinez et al., 2012) and Madagascar (Fernald et al., 2009). Specific items (both words and pictures) were modified for use in Malawi. For example, we replaced “apple” with “papaya,” a fruit that is well known throughout the country and that was estimated to be of similar difficulty to “apple” in the United States. Items were scored as pass or fail, and a summed continuous score was calculated.
3. The Leiter-R Sustained Attention (LSA) task (Roid & Miller, 1997), a language-free measure that assesses how well children can maintain attention and accuracy during a timed visual search task. The measure has successfully detected group differences in performance in Madagascar (Fernald et al., 2011). Total adjusted scores were determined by subtracting the number of errors from the number of correct responses.
4. The Strengths and Difficulties Questionnaire (SDQ) (Goodman, 2001; Woerner et al., 2004), a brief, parent-report questionnaire that screens for both behavioral problems and pro-social (positive) behaviors. All items were translated, back-translated and approved by the test author. The SDQ has been used in several African countries, including Kenya (Oburu, 2005) and South Africa (Cluver et al., 2007). Scores were determined for the four behavior problem subscales, the pro-social subscale, and a total difficulties (problem) score.
5. Kaufman Assessment Battery for Children, 2nd Edition (KABC-II) (Kaufman & Kaufman, 2004), from which three scales were adopted. Hand Movements is a non-verbal, short-term motor memory task requiring children to copy increasingly difficult hand movement sequences. A total score of passed items was calculated. Number Recall is a short-term auditory memory task requiring children to repeat a series of increasingly difficult number sequences. As children learn numbers in English, no translation was required. The total score reflects the number of passed items. Finally, Triangles is a non-verbal problem-solving task that requires children to complete increasingly complex patterns and figures with plastic and foam shapes. The number of items correctly completed was summed. The Kaufman scales have been used in Kenya (Holding et al., 1999), Senegal (Boivin, 2002), and Uganda (Bangirana et al., 2009).
6. Early Grade Mathematics Assessment (EGMA) (Brombacher, 2011), a tool developed by USAID to measure early knowledge of numbers and basic math skills, validated in Malawi. A great advantage of the EGMA is that there are Malawian norms available, as well as norms from neighboring countries, allowing for easy comparison and interpretation of scores. Three subscales (number recognition, quantity discrimination, and addition) were administered. Passed items for each subscale were summed to create subscale scores.
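The scoring rules described for the measures above are simple enough to sketch in a few lines. The function names and example values below are hypothetical; only the rules themselves (summing passed items, and subtracting errors from correct responses for the LSA) come from the text.

```python
def summed_score(item_results):
    """Count passed items, where each item is True (pass) or False (fail)."""
    return sum(1 for passed in item_results if passed)

def lsa_adjusted_score(n_correct, n_errors):
    """LSA total adjusted score: correct responses minus errors."""
    return n_correct - n_errors

# Made-up item results for two subscales of one child.
language_items = [True, True, False, True]
fine_motor_items = [True, False, False]

language_score = summed_score(language_items)
fine_motor_score = summed_score(fine_motor_items)
total_score = language_score + fine_motor_score

# Made-up LSA counts for the same child.
adjusted = lsa_adjusted_score(n_correct=25, n_errors=4)
```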
Anthropometric measurements were made at baseline to (i) assess balance across groups, and (ii) control for any direct or indirect influences that growth faltering (specifically stunting or chronic malnutrition) might have on the other child assessments. Child height and weight were measured to the nearest 0.1 cm and 0.1 kg, respectively, following established guidelines (Cogill, 2003). Height-for-age (HAZ), weight-for-height (WHZ), and weight-for-age (WAZ) Z-scores were then calculated using the 2006 WHO growth standards (WHO Multicentre Growth Reference Study Group, 2006).
All enumerators were trained for a minimum of two weeks at each data collection time point, and all followed standardized procedures for administering each measure. Inter-rater reliability, as indicated by the correlation between scores obtained by two different testers for the same child, was estimated by having the enumerators observe and score videotaped administrations. Average inter-rater reliabilities were 0.95 (for MDAT Fine Motor at baseline and Round 2), 0.88 (for MDAT Language at baseline and Round 2), 0.94 (PPVT, baseline), and 0.96 (Triangles, Round 3).
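For readers unfamiliar with this reliability index, the sketch below computes the correlation between two testers' scores for the same children. It assumes a Pearson correlation, which the text does not state explicitly, and the scores shown are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up scores given by two testers to the same five videotaped children.
tester_a = [12, 15, 9, 20, 17]
tester_b = [13, 15, 10, 19, 18]
reliability = pearson_r(tester_a, tester_b)
```

Values close to 1 indicate that the two testers rank and score children almost identically.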
Primary Caregiver Measures:
In addition to gathering data on household and demographic characteristics, information was gathered on the primary caregiver’s health and the home environment. At baseline and the first follow-up (18 months post-baseline, Round 2), the status of the primary guardian’s mental health functioning was assessed. At all rounds, the provision of household stimulation for learning and development and the use of positive disciplinary techniques were measured. As with the child outcomes, resulting scores were standardized. The following scales were administered to each caregiver; scales that were child specific (e.g. the Parenting Stress Index) were administered once for each child:
1. The Center for Epidemiological Studies, Depression (CESD) scale (Radloff, 1977), a 20-item scale that assesses depressive symptoms in adults and has been widely used throughout the world.
2. The Parenting Stress Index (PSI) (Abidin, 1990), an adapted 43-item scale that asks parents or guardians to report on their perceptions of parenting the target children in the study. Higher scores indicate more stress related to parenting this child.
3. The Support for Learning and Positive Parenting module (UNICEF, 2010), adapted from the UNICEF Multiple Indicator Cluster Surveys. Support for learning is determined by both the availability of materials (books, toys, etc.) that promote development and the activities adults do with children to encourage learning. Typical behavior control strategies or disciplinary techniques were also measured.
CBCC Measures:
Extensive information on the characteristics of the CBCC, its staff, and the quality of staff-child interactions was gathered at baseline and both follow-up rounds of data collection. The CBCC questionnaire and observation measure were adapted from La Escala de Evaluación de la Calidad Educativa de Centros de Educación Preescolares (ECCP) (Martínez et al., 2004) from Mexico and a preschool quality tool developed for use in Cambodia (Rao et al., 2012). Ministry personnel and Malawian child development experts suggested additional items. Items included information on CBCC building structure and space, availability of learning materials, provision of meals, typical daily schedule of activities, number of children enrolled, and availability of water and toilets. The majority of survey questions were administered to the CBCC director, but the preschool teacher(s) also provided some information on their training, education and experience. Responses to the questionnaire were standardized to a mean of 0 and standard deviation of 1.
In addition to the CBCC survey, observations were conducted while the CBCC was operating to provide an objective account of classroom organization, activities and caregiver-child interactions. These observations are a valuable complement to the more subjective data gathered via the questionnaire in that they provide classroom information gathered firsthand. While individual inter-rater reliability obtained during piloting was good (0.76), we learned that allowing the enumerator pairs to observe (taking brief notes) and then reach consensus together for each index was a more thorough method for capturing information about the CBCC’s overall functioning. To complete the observations, pairs of trained fieldworkers arrived unannounced just as the CBCC was opening, and observed normal center activities for one hour. The enumerators rated responses together across a variety of indices that included, for example, teachers’ style of teaching various concepts, encouragement of child participation in learning, time spent reading, time spent engaged with children (either individually or in groups), response to children’s needs, disciplinary strategies, use of small and large groups, and interactions that promote children’s social development. Scores for the observation measure were derived from a principal components analysis (PCA).
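To make the PCA-based scoring concrete, the following sketch derives a single score per center from the first principal component of standardized observation ratings. It is only an illustration under stated assumptions: the text does not say how many components were retained or how items were scaled, the ratings are invented, and the leading component is found with a simple power iteration rather than any particular statistics package. The sketch also assumes every index varies across centers and that the indices are positively correlated.

```python
from math import sqrt

def first_component_scores(rows, n_iter=200):
    """Score each row on the first principal component of standardized columns."""
    n, p = len(rows), len(rows[0])
    cols = list(zip(*rows))
    # Standardize each column to mean 0 and sample SD 1.
    means = [sum(c) / n for c in cols]
    sds = [sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) for c, m in zip(cols, means)]
    z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in rows]
    # Correlation matrix of the columns.
    corr = [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1) for b in range(p)]
            for a in range(p)]
    # Power iteration converges to the leading eigenvector (the loadings).
    w = [1.0] * p
    for _ in range(n_iter):
        w = [sum(corr[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    # Project each standardized row onto the loadings.
    return [sum(zi[j] * w[j] for j in range(p)) for zi in z]

# Made-up consensus ratings: five centers (rows) on three indices (columns).
ratings = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]]
scores = first_component_scores(ratings)
```

Because the columns are centered, the resulting scores average to zero, and centers rated higher across indices receive higher scores.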