Now, its leader, John Ayers, has announced his resignation, a few weeks after the Institute retracted one of its bigger reports, one that seemed to find bright spots for students and schools in the struggling city. The report, "Beating the Odds: Academic Performance and Vulnerable Student Populations in New Orleans Public High Schools," claimed that some New Orleans high schools were outperforming expectations, given the difficult circumstances faced by many of their students.
And that's exactly why education watchers around the country should care. The report attempted to use an approach called value-added modeling. And value-added is currently the golden fleece for anyone questing after what's really working in education.
Value-added models promise to provide a detailed, nuanced picture of school performance — to screen out the background noise and zero in on the impact of individual schools and even individual classrooms.
But value-added modeling, it turns out, is really, really hard.
To understand why, we need to think like researchers. Policymakers face a basic task. They want to compare the performance of different teachers and schools. No Child Left Behind, the federal law, did that by asking a simple question: How many children in this school score proficiently?
It turns out that question is too simple.
A school full of low-income students or English-language learners will, on average, post much lower test scores and graduation rates than a school serving more advantaged students, even if the administrators and teachers are working heroically. The students can be making two years' worth of progress in a single year and still miss the target on state tests, because they're just that far behind to begin with.
In a high-stakes accountability world, such a school can, in turn, be labeled "failing" and reorganized or closed, when it's actually doing a wonderful job.
So here's the workaround. Instead of just testing students once each year, and rating schools and teachers based on student proficiency, look at test scores over time. Say, a pretest at the beginning of a school year, and the same test at the end of the year, and then compare the scores. That's called a growth measure.
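To make the difference concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the two schools, the scores, and the cutoff of 300 for "proficient" are assumptions, not data from any real test. It simply computes both yardsticks, the proficiency rate and the average pretest-to-posttest gain, for the same students.

```python
# Hypothetical scale on which a score of 300 or more counts as "proficient".
PROFICIENT = 300

# (pretest, posttest) pairs for each student. All numbers are invented.
schools = {
    "School A": [(310, 318), (325, 330), (305, 312), (290, 301)],
    "School B": [(220, 262), (240, 281), (210, 255), (265, 304)],
}


def proficiency_rate(scores):
    """Share of students at or above the cutoff on the end-of-year test."""
    return sum(post >= PROFICIENT for _, post in scores) / len(scores)


def mean_growth(scores):
    """Average gain from the pretest to the end-of-year test."""
    return sum(post - pre for pre, post in scores) / len(scores)


for name, scores in schools.items():
    print(f"{name}: {proficiency_rate(scores):.0%} proficient, "
          f"average growth {mean_growth(scores):.2f} points")
```

In this made-up case, School A looks far better on proficiency while School B's students gain several times as many points over the year, which is the kind of school a proficiency-only rating could label "failing."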
Value-added measurements are one step more complex. They track student growth, sometimes using multiple tests and other measures such as graduation rates. They then compare that growth to a model of predicted growth based on the students' characteristics.
Here's an example of why this technique is so promising. Say that in the third grade, students typically make exactly one year's worth of progress. But in Ms. Smith's class, students are gaining one year and six months' worth of learning. In the same school, Ms. Jones has otherwise similar students, but her students gain just six months' worth of learning in that third-grade year.
The goal is to see how much value a particular teacher or a particular school is adding, compared to what would be expected. Thus, "value added."
Growth measurements address the question: given where a student started, how far has she come? Value-added models ask: and how does that growth reflect back on the teacher or school?
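The Ms. Smith and Ms. Jones example can be written out as a toy calculation. This is a deliberately simplified sketch, not how any state or researcher actually computes value-added: real models predict each student's growth statistically from test histories and student characteristics, while here the prediction is just the flat "one year" from the example. The individual student numbers are invented so that the class averages match the example.

```python
from statistics import mean

# Predicted growth for a typical third grader, in months of learning.
# Assumption: a flat 12 months for every student, as in the example above.
EXPECTED_GROWTH_MONTHS = 12

# Months of learning gained by each student (invented values whose
# averages match the example: 18 months for Smith, 6 for Jones).
classes = {
    "Ms. Smith": [17, 19, 18, 18],
    "Ms. Jones": [5, 7, 6, 6],
}


def value_added(gains, expected=EXPECTED_GROWTH_MONTHS):
    """Average of (actual growth - predicted growth) across a class."""
    return mean(gain - expected for gain in gains)


for teacher, gains in classes.items():
    print(f"{teacher}: value added {value_added(gains):+.1f} months")
```

Under these assumptions Ms. Smith comes out six months above the prediction and Ms. Jones six months below it. The hard part, which this sketch skips entirely, is producing a trustworthy prediction in the first place.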
Both student growth measures and value-added models are being adopted in most states. Education Secretary Arne Duncan is a fan. He wrote on his blog in September, "No school or teacher should look bad because they took on kids with greater challenges. Growth is what matters." Joanne Weiss, Duncan's former chief of staff, told me last month, "If you focus on growth you can see which schools are improving rapidly and shouldn't be categorized as failures."
But there's a problem. The math behind value-added modeling is very, very tricky. The American Statistical Association, earlier this year, issued a public statement urging caution in the use of value-added models, especially in high-stakes conditions. Among the objections:
Value-added models are complex. They require "high-level statistical expertise" to do correctly;
They are based only on standardized test scores, which are a limited source of information about everything that happens in a school;
They measure correlation, not causation. So they don't necessarily tell you if a student's improvement or decline is due to a school or teacher or to some other unknown factor;
They are "unstable." Small changes to the tests or the assumptions used in the models can produce widely varying rankings.
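The last objection, instability, is easy to see in a toy example. The sketch below is an illustration under invented assumptions, not a reconstruction of any real model: two hypothetical teachers, made-up growth numbers, and two made-up rules for predicting growth. One rule expects 12 months of growth from every student. The other expects 9 months from English-language learners. Nothing about the students changes between the two runs, only the modeling assumption.

```python
from statistics import mean

# Each student: (months of learning gained, is an English-language learner).
# Teachers and numbers are invented for illustration.
classes = {
    "Teacher X": [(13, False), (14, False), (13, False), (14, False)],
    "Teacher Y": [(12, True), (13, True), (11, True), (12, False)],
}


def expected_uniform(is_english_learner):
    """Assumption 1: every student is predicted to gain 12 months."""
    return 12


def expected_adjusted(is_english_learner):
    """Assumption 2: English-language learners are predicted to gain 9."""
    return 9 if is_english_learner else 12


def rank_teachers(expected):
    """Order teachers from highest to lowest average value added."""
    scores = {
        teacher: mean(gain - expected(el) for gain, el in students)
        for teacher, students in classes.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


print("Uniform prediction: ", rank_teachers(expected_uniform))
print("Adjusted prediction:", rank_teachers(expected_adjusted))
```

With the uniform prediction, Teacher X ranks first. With the adjusted one, Teacher Y does. Which assumption is "right" is exactly the kind of judgment call that requires the statistical expertise the association is talking about.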
And that brings us back to "Beating the Odds." Ayers told the New Orleans Times-Picayune that the flawed report and his sudden departure are unrelated. Jonah Evans, the director of policy at the Cowen Institute, declined to comment, and referred us to Tulane's PR department. "As a founder of the Institute, I remain committed to supporting the talented [Cowen Institute] team in the vital work it does," Tulane's just-retired president Scott Cowen said in a statement from the university.
But the incident should nevertheless serve as a caution both to education researchers and to those of us who make decisions based on their findings.