

Reading / Upper Elementary


Review Methods

An exhaustive search considered more than 2000 published and unpublished articles. The review included only those studies that met the following criteria.

  • Schools or classrooms using each program had to be compared to randomly assigned or well-matched control groups.
  • Study duration had to be at least 12 weeks.
  • Outcome measures had to be assessments of the reading content being taught in all classes. Almost all are standardized tests or state assessments.
  • The review placed particular emphasis on studies in which schools, teachers, or students were assigned at random to experimental or control groups.

Program Ratings Basis

Programs were rated according to the overall strength of the evidence supporting their effects on reading achievement. “Effect size” (ES) is the proportion of a standard deviation by which a treatment group exceeds a control group. In computing mean effect sizes, each study was weighted by its sample size. The categories are as follows:
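As a rough illustration of the two calculations just described, the sketch below computes an effect size and a sample size-weighted mean. The function names and the example numbers are hypothetical, and the review does not say here which standard deviation is used, so the caller supplies it.

```python
def effect_size(treatment_mean, control_mean, sd):
    """Treatment-control difference expressed as a fraction of a standard deviation.

    Which standard deviation to use (pooled, control group, etc.) is left to
    the caller; that choice is an assumption, not something stated in the text.
    """
    return (treatment_mean - control_mean) / sd


def weighted_mean_effect_size(studies):
    """Sample size-weighted mean of effect sizes.

    `studies` is a list of (effect_size, sample_size) pairs.
    """
    total_n = sum(n for _, n in studies)
    return sum(es * n for es, n in studies) / total_n


# Hypothetical numbers: a 2-point gain on a test with SD 10 is an ES of about +0.20.
es = effect_size(52.0, 50.0, 10.0)

# Two hypothetical studies: the larger study pulls the mean toward its own ES.
mean_es = weighted_mean_effect_size([(0.30, 100), (0.10, 300)])
```

Weighting by sample size means a small study with a large effect cannot dominate the average on its own.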

Strong Evidence of Effectiveness: At least two prospective studies (i.e., not post hoc), one of which is a large (n=250) randomized or randomized quasi-experimental study, or multiple smaller studies, with a sample size-weighted effect size of at least +0.20, and a collective sample size across all studies of at least 500 students. To qualify for this category, effect sizes from the randomized studies must have a weighted mean effect size of at least +0.20.

Moderate Evidence of Effectiveness: At least two randomized or matched prospective studies, with a collective sample size of 500 students, and a weighted mean effect size of at least +0.20.

Limited Evidence of Effectiveness: Strong Evidence of Modest Effects: Studies meet the criteria for “Moderate Evidence of Effectiveness” except that the weighted mean effect size is +0.10 to +0.19.

Limited Evidence of Effectiveness: Weak Evidence with Notable Effect: A weighted mean effect size of at least +0.20 based on one or more qualifying studies of any qualifying design insufficient in number or sample size to meet the criteria for “Moderate Evidence of Effectiveness.”

Insufficient Evidence of Effectiveness: One or more qualifying studies not meeting the criteria for “Limited Evidence of Effectiveness.”

No Qualifying Studies: No studies met inclusion standards.
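The categories above can be read as a decision procedure, checked from strongest to weakest. The following is a simplified sketch, not the review's own scoring tool. The `Study` type and `rate` function are hypothetical. It assumes every study passed in is a qualifying prospective study, treats "large" as a sample of at least 250, treats "+0.10 to +0.19" as at least 0.10 and below 0.20, and implements only the large-randomized-study route to the "Strong" category (the "multiple smaller studies" alternative is not modeled).

```python
from dataclasses import dataclass


@dataclass
class Study:
    effect_size: float
    n: int             # number of students
    randomized: bool   # True for randomized or randomized quasi-experimental designs


def weighted_mean(studies):
    """Sample size-weighted mean effect size."""
    return sum(s.effect_size * s.n for s in studies) / sum(s.n for s in studies)


def rate(studies):
    """Return a rating label for a program (simplified; see assumptions above)."""
    if not studies:
        return "No Qualifying Studies"

    total_n = sum(s.n for s in studies)
    mean_es = weighted_mean(studies)
    randomized = [s for s in studies if s.randomized]

    # Shared threshold for Strong, Moderate, and "Modest Effects":
    # at least two studies and at least 500 students in total.
    enough_evidence = len(studies) >= 2 and total_n >= 500

    # Assumption: "large" means n >= 250; the smaller-studies route is omitted.
    has_large_randomized = any(s.n >= 250 for s in randomized)

    if (enough_evidence and mean_es >= 0.20 and has_large_randomized
            and weighted_mean(randomized) >= 0.20):
        return "Strong Evidence of Effectiveness"
    if enough_evidence and mean_es >= 0.20:
        return "Moderate Evidence of Effectiveness"
    if enough_evidence and 0.10 <= mean_es < 0.20:
        return "Limited Evidence: Strong Evidence of Modest Effects"
    if mean_es >= 0.20:
        return "Limited Evidence: Weak Evidence with Notable Effect"
    return "Insufficient Evidence of Effectiveness"
```

For example, under these assumptions a single hypothetical study of 100 students with an effect size of +0.30 would land in "Weak Evidence with Notable Effect", because the effect is large enough but the number of studies and the total sample are too small for a higher category.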

Center for Data-Driven Reform in Education, Johns Hopkins University School of Education