David Leonhardt’s value-added piece is characteristically thoughtful but I think he skips over some important issues here:
Value-added data is not gospel. Among the limitations, scores can bounce around from year to year for any one teacher, notes Ross Wiener of the Aspen Institute, who is generally a fan of the value-added approach. So a single year of scores — which some states may use for evaluation — can be misleading. In addition, students are not randomly assigned to teachers; indeed, principals may deliberately assign slow learners to certain teachers, unfairly lowering their scores. As for the tests themselves, most do not even try to measure the social skills that are crucial to early learning.
The value-added data probably can identify the best and worst teachers, researchers say, but it may not be very reliable at distinguishing among teachers in the middle of the pack. Joel Klein, New York City’s reformist schools chancellor, told me that he considered the Los Angeles data powerful stuff. He also said, “I wouldn’t try to make big distinctions between the 47th and 55th percentiles.” Yet what parent would not be tempted to?
Many things in life fall along a normal distribution, and there’s no reason to think teachers are an exception. I wouldn’t try to make a big distinction between the 47th and 55th percentiles of adult male height, for example, because there isn’t a big distinction. That doesn’t mean that tape measures are an unreliable method of estimating height.
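To put rough numbers on the analogy, here is a quick sketch of how little separates those two percentiles near the middle of a normal curve. The mean and standard deviation are round-number assumptions chosen for illustration, not survey data:

```python
from statistics import NormalDist

# Assumed round numbers for adult male height, purely for illustration.
height = NormalDist(mu=175.0, sigma=7.0)  # centimeters

p47 = height.inv_cdf(0.47)
p55 = height.inv_cdf(0.55)

print(f"47th percentile: {p47:.1f} cm")
print(f"55th percentile: {p55:.1f} cm")
print(f"Difference:      {p55 - p47:.1f} cm")  # about 1.4 cm near the middle of the curve
```

Eight percentile points in the crowded middle of the distribution amount to barely more than a centimeter. The measuring instrument isn’t the problem; the distinction itself is small.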
Similarly, a single year of scores can be a misleading way of estimating a teacher’s effectiveness over multiple years. But treating one year as if it measured a whole career is an inherently dumb way to interpret a single year of scores. The fact that teacher value-added scores vary from year to year isn’t a bug, it’s a feature. Teacher effectiveness isn’t a permanent attribute; it’s a description of what happened among a given group of students over a given length of time. There are plenty of unreasonable ways to interpret that information. That’s not an argument that the information shouldn’t exist.
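A toy simulation shows what sensible interpretation looks like. Every number below is invented, and the underlying effect is held constant purely to isolate the noise: any single year’s estimate bounces around, but averaging a few years of the same information gives a much steadier read.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented numbers, not drawn from any real value-added system:
# a fixed underlying effect plus year-to-year estimation noise.
true_effect = 0.10       # hypothetical effect, in student-level SD units
yearly_noise_sd = 0.15   # hypothetical single-year estimation error

single_year = true_effect + rng.normal(0, yearly_noise_sd, size=10_000)
three_year_avg = true_effect + rng.normal(0, yearly_noise_sd, size=(10_000, 3)).mean(axis=1)

print(f"Spread of single-year estimates: {single_year.std():.3f}")
print(f"Spread of three-year averages:   {three_year_avg.std():.3f}")  # shrinks by roughly 1/sqrt(3)
```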
Teachers are disadvantaged by being assigned to slow learners only if the value-added measure doesn’t account for students’ previous learning history. But widely used systems like EVAAS do exactly that: each student’s growth is compared to that student’s own previous learning trajectory. If a student learned slowly in previous grades, that slower trajectory becomes the baseline for comparison.
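To illustrate the general idea in the simplest possible terms, here is a toy sketch of a baseline-adjusted measure. It is not the actual EVAAS methodology, which fits a far more elaborate longitudinal mixed model; it only shows how a student’s own prior trajectory can serve as the point of comparison:

```python
from statistics import mean

# Toy sketch of baseline-adjusted growth, not the real EVAAS model.
def expected_score(prior_scores):
    """Project this year's score from the student's own prior growth trend."""
    gains = [b - a for a, b in zip(prior_scores, prior_scores[1:])]
    return prior_scores[-1] + mean(gains)

def classroom_value_added(students):
    """Average of (actual - expected) across a classroom's students."""
    return mean(actual - expected_score(priors) for priors, actual in students)

# Hypothetical students: (prior-year scores, current-year score).
# A slow learner's low prior growth becomes the baseline, so the teacher
# is credited for any growth beyond that trajectory.
classroom = [
    ([410, 418, 425], 436),  # slower prior growth, above-trajectory year
    ([500, 515, 531], 545),  # faster prior growth, roughly on trajectory
]
print(f"Classroom value-added: {classroom_value_added(classroom):+.1f} points")
```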
More broadly, the whole LA Times debate illustrates the long-run futility of aligning your interests against public information. Once schools started testing students annually, it was just a matter of time before someone took the logical next step of estimating how much the students in a given classroom learned from the beginning of the year to the end. To be sure, there are plenty of effective short-term ways to keep this kind of information from being created or released. But they are only short-term. When I sat down seven years ago to write a policy paper on value-added, I assumed this day would come sooner than it did. Many aspects of that paper, my first attempt at the form, make me wince today: it’s far too long, awkwardly organized, and all but ignores the crucial issue of statistical error. But I think the basic point, that value-added data can help create a more functional job market for teachers, still stands. And now that the genie is out of the bottle, it’s not going back in.