The Psychology of Grading Writing
A recent InsideHigherEd.com story relays some problems and successes ETS has had in constructing an automated essay grading program.
Your first reaction to this might be: How can I get that for my classes? A second, more cynical reaction might be: Oh no, this is why I’m paid big money to be an instructor! Soon colleges will be filled with classrooms led by Watsons!
After those natural responses, I came to a third: This article is fascinating for what it reveals about the psychology of grading written work. Here are some intriguing passages (bolds mine):
1. On grading (automated versus human) and revising papers: “Andrew Klobucar, assistant professor of humanities at NJIT, said that he has also noticed a key change in student behavior since the introduction of E-Rater. One of the constant complaints of writing instructors is that students won’t revise. But at NJIT, Klobucar said, first-year students are willing to revise essays multiple times when they are reviewed through the automated system, and in fact have come to embrace revision if it does not involve turning in papers to live instructors. Students appear to view handing in multiple versions of a draft to a human to be “corrective, even punitive,” in ways that discourage them, he said.”
2. On what we unconsciously look for in grading essays: “[Les] Perelman did not dispute the possibility that automated essay grading may correlate highly with human grading in the NJIT experiment. The problem, he said, is that his research has demonstrated that there is a flaw in almost all standardized grading of short essays: In the short essay, short time limit format, scoring correlates strongly with essay length, so the person who gets the most words on paper generally does better — regardless of writing quality, and regardless of human or computer grading. In four separate studies of the SAT essay tests, Perelman explained, high correlations were found between length and score.”
3. On word complexity and human vs. automated grading: “Word complexity is judged, among other things, by average word length, so, [Perelman] suggested, students are rewarded for using “antidisestablishmentarianism,” regardless of whether it really advances the essay. And the formula also explicitly rewards length of essay. Perelman went on to show how Lincoln would have received a poor grade on the Gettysburg Address (except perhaps for starting with “four score,” since it was short and to the point). And he showed how the ETS rubric directly contradicts most of George Orwell’s legendary rules of writing. For instance, he noted that Orwell instructed us to “never use a metaphor, simile, or other figure of speech which you are used to seeing in print,” to “never use a long word where a short one will do” and that “if it is possible to cut a word out, always cut it out.” ETS would take off points for following all of that good advice, he said.”
– TL