A few weeks ago, I wrote about research on new computer-based tools to assess student essays. I concluded that, for now, these tools might be best for establishing basic levels of writing proficiency. But I also noted that the greatest value of these tools may lie not in high-stakes testing but in increasing writing practice and revision.
Randy Bennett, one of the world’s leading experts on technology-enhanced assessments, points me to his extremely helpful — and readable — new article, which offers advice to the assessment consortia as they look to implement automated scoring (not just for writing, but also for literacy and math).
Bennett’s paper distinguishes among the various types of automated scoring tasks, illustrating where automated scoring is most ready for high-stakes use. He makes a much-needed call for transparency in scoring algorithms and even offers ideas on how automated and human-based scoring can improve one another (noting flaws in human-based scoring, too). He ends with this sensible approach:
The operational introduction of automated scoring might be thought of in terms of a continuum, where the most innovative (and least trustworthy) methods are always paired with well-supervised human scoring and the least innovative (but most trustworthy) methods run with only human checking of quality-control samples. All along the continuum, carefully examining the human-machine discrepancies will help identify weaknesses in both the automated and the human process, directing developers and program managers to where improvements are needed. As the automated technologies are refined and evidence amasses, they can be progressively moved up the continuum toward more independent use.