January 05, 2018
January 10, 2018 3:23 PM EST

The University of Chicago
Summary: The purpose of the evaluation is to assess the effectiveness of a classroom-based program intended to increase the vocabulary of preschool and primary school children. The program, called the Big Word Club (BWC), consists of videos, books and activities intended to help children learn one new word per day over a school year. The intent of the program is not only to teach specific new words, but also to increase children’s interest in words and literacy in general and thereby improve school success. The evaluation will assess the program’s success in achieving the narrow goal of increasing children’s receptive vocabulary. Specifically, we will assess receptive vocabulary using a test that includes words from the BWC as well as a set of age-appropriate words not included in the BWC. The evaluation will include a minimum of 50 schools and maximum of 65 schools all located in the Southwestern United States.
The Big Word Club (BWC) is a digital learning program that uses books, songs, animation and dance to introduce children to a new word every day of the school year. It is intended for children in preschool through grade 4, with different classroom materials depending on the grade. In general, the words are “big” in the sense that many are not typical of the vocabulary of such young children. For example, the words for preschoolers include gargantuan, primate, prehensile, equator, and slither. Each week the BWC provides classroom teachers with nine new videos based on that week’s theme: five that each introduce the word for one day; one animated book, one animated music video, and one dance video, all of which include the five words for that week; and a review of the week’s words. The BWC gives teachers considerable flexibility, since they can use the videos at any time during the day. Each video is only 3-4 minutes long, so implementing the BWC is not costly in terms of classroom time. Many teachers report using the animated books at story time, the dances as a break during the day, and the songs during sing-along time. The review is typically done on Fridays. The BWC is intended to supplement, not substitute for, the normal classroom literacy curriculum.
Schools are randomly assigned either to participate in the BWC or to be in a business-as-usual control group.
Randomization completed in office by a computer.
School level
50-65 schools
Estimated 750-1,000 children assessed twice (1,500 - 2,000 total observations)
50 - 65 schools with an estimated 16 - 20 children per classroom.
We are grateful to Kenya Heard and Rohit Naimpally for assistance with power calculations for this intervention. The power estimates are based on the assumption that we will have 46 schools. Even with a conservative estimate of only 16 students per school, we obtain adequate power for detecting a doubling effect on a vocabulary test. We assume that the assessment will ask children to identify 30 words, 15 of which are covered only in the BWC curriculum and 15 of which are more general, and that the treatment group gets at least twice as many of the first set of 15 words correct as the control group. For the intra-school correlation (ICC), we consider two commonly used values: 0.2 and 0.4. The lower value is fairly standard in education; to be safe, we also consider a value on the higher end, 0.4.
Figure 1 shows the minimum detectable effects under 14 scenarios (7 for each ICC assumption) that vary the number of words the control group might get right on the BWC part of the assessment. We require the experiment to have at least 80% power at a significance level of 0.05. For each of the 14 scenarios, we check whether the minimum detectable effect is small enough that the experiment could detect the treatment group getting twice as many words right as the control group. For instance, under an ICC of 0.2, if the control group children were to get 1 out of 15 BWC words correct (~7% of the words), the experiment would be able to detect an effect equal to a 14 percentage point increase for the treatment group (translating to the treatment group getting ~3 words correct out of 15). Because this minimum detectable effect requires the treatment group to get more than twice as many words correct as the control group (roughly three words versus one), the experiment would not be sufficiently powered to detect a doubling under these conditions.
In all cases where the control participants answer an average of 3 or more questions correctly and the treatment participants answer more than twice as many correctly, however, there is sufficient power. There is also sufficient power when the control participants answer an average of 2 questions correctly and the ICC is 0.2. These plausible cases suggest we have adequate power, given that the main outcome will include the specific words the program tries to teach. We expect to have more than 16 students per school and more than 46 schools, both of which will increase power further. We will also try to obtain additional background variables such as gender, race, home language and Teaching Strategies Gold scores, so that we can condition on them to reduce the intra-school correlation further.
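The reasoning above can be illustrated with a minimal sketch of a minimum detectable effect (MDE) calculation for a school-randomized design. The registration does not state which formula or software produced Figure 1, so everything here is an assumption: the function names are hypothetical, the formula is the common normal-approximation MDE for cluster-randomized trials, the outcome is treated as the share of the 15 BWC words answered correctly with binomial variance, schools are split equally between arms, and the seven control-group scenarios are taken to be 1 to 7 words correct. Its output is not expected to reproduce Figure 1.

```python
from math import sqrt
from statistics import NormalDist


def mde_cluster_rct(n_schools, students_per_school, icc, control_share,
                    alpha=0.05, power=0.80, treated_fraction=0.5):
    """Approximate MDE, in share-correct units, for a school-randomized trial.

    Assumption: normal approximation with binomial outcome variance; this is
    a textbook formula, not necessarily the one behind Figure 1.
    """
    z = NormalDist()
    # Sum of critical values for a two-sided test at `alpha` and the target power.
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Variance of a 0/1 outcome at the assumed control-group share correct.
    outcome_var = control_share * (1 - control_share)
    # Clustering term: between-school share plus within-school share per student.
    clustering = icc + (1 - icc) / students_per_school
    allocation = treated_fraction * (1 - treated_fraction)
    return multiplier * sqrt(outcome_var * clustering / (allocation * n_schools))


def can_detect_doubling(n_schools, students_per_school, icc, control_share):
    """A doubling is a gain equal to the control share, so compare the MDE to it."""
    mde = mde_cluster_rct(n_schools, students_per_school, icc, control_share)
    return mde <= control_share


# Hypothetical scenario grid: 46 schools, 16 students each, 1-7 of 15 words correct.
for icc in (0.2, 0.4):
    for words_correct in range(1, 8):
        share = words_correct / 15
        mde = mde_cluster_rct(46, 16, icc, share)
        print(f"ICC={icc} control={words_correct}/15 MDE={100 * mde:.1f} pp")
```

The comparison in `can_detect_doubling` mirrors the logic of the text: if the control group gets a share p of the words right, a doubling is a gain of p, so the design is powered for a doubling only when the MDE is no larger than p. Raising the number of schools or students per school, or lowering the ICC (for example by conditioning on background variables), shrinks the MDE.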
The University of Chicago SBS IRB
