Copyright NBC10 Boston

The Massachusetts Department of Elementary and Secondary Education notified nearly 200 school districts that at least one of their students was affected by an AI grading glitch that resulted in hundreds of essays being scored incorrectly on the MCAS exams.

The NBC10 Investigators obtained the list of 192 school districts from a public records request to DESE.

As we first reported last month, the technology incorrectly graded about 1,400 essays, which were later rescored after the error was detected and the state’s testing contractor performed a review.

In one example we found, an elementary school teacher in Lowell read her students’ essays over the summer and discovered the scores some received did not seem to add up. For instance, a student received a zero on an essay that should have received six out of seven possible points. That teacher alerted district leaders, who then contacted DESE.

Shaleen Title, an adjunct professor and parent of a student in Malden, took her concerns to the school committee after seeing our story. Title told us she had no idea AI was used to grade essays on the standardized test.

“It was news to me,” Title said. “It was astonishing and concerning. I really hope that we get a lot more communication about this as time goes on.”

The educator said she worries about the AI showing bias against students with disabilities or those who are learning English as a second language. She also wonders if the technology penalizes a creative answer that does not fit a certain scoring rubric.

“It’s one thing to grade math and science with AI, but an essay is something that should be original and creative,” Title said.
“I think there should be a human looking at it.”

But according to DESE, humans do not put a set of eyes on the majority of student essays. The agency said AI essay scoring works by using human-scored exemplars of what essays at each score point look like. The AI tool uses that information to score the essays. In addition, humans give 10 percent of the AI-scored essays a second read and compare their scores with the AI scores to make sure there aren’t discrepancies.

“You know, machine learning is good for some things, but it’s really not good for some other things,” said Jim Kang, a parent in Medford. As part of his job, Kang said he performs audits of AI evaluations. After seeing our story, Kang penned a “letter to the editor” in his local newspaper about why he thinks the technology should not be used to grade kids’ essays.

“It seemed like a lot of people didn’t realize this is a big deal,” Kang said. “It’s disrespectful to students because they are told not to use AI when they write and read.”

Proponents tout the positives of using AI to return scores to school districts faster so appropriate action can be taken. For instance, test results can help identify a child in need of intervention or highlight for a teacher a lesson plan that did not seem to resonate with students.

When asked about the issue last month, education commissioner Pedro Martinez called it a “learning experience” and said DESE is working with the testing vendor to make sure guardrails are in place to avoid a repeat problem.

“I don’t want to discourage the use of technology because it allows us to get more thoughtful on the assessments and that way we can hone in on skills that are essential for students to reach our vision for a graduate,” Martinez said.

According to a copy of the communication DESE sent out, each school district received info about affected students, their previous essay score, and the updated score.
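For readers curious what a “10 percent second read” looks like in practice, here is a minimal, hypothetical sketch of that audit loop in Python. The scoring functions are toy stand-ins, not the vendor’s actual system, and all names (`ai_score`, `human_score`, `audit`) are invented for illustration: every essay gets an automated score, a random 10 percent sample is rescored by a human, and any disagreement is flagged for review.

```python
import random

AUDIT_RATE = 0.10   # fraction of AI-scored essays given a human second read
MAX_SCORE = 7       # the MCAS essay in the story was scored out of 7 points

def ai_score(essay: str) -> int:
    """Toy stand-in for an exemplar-trained automated scorer."""
    return min(MAX_SCORE, len(essay.split()) // 20)

def human_score(essay: str) -> int:
    """Toy stand-in for a human rater's score."""
    return min(MAX_SCORE, len(essay.split()) // 20)

def audit(essays, rate=AUDIT_RATE, seed=0):
    """Score every essay with the AI, sample `rate` of them for a human
    reread, and return (scores, flagged) where `flagged` lists sampled
    essays whose human score disagrees with the AI score."""
    rng = random.Random(seed)
    scores = {i: ai_score(e) for i, e in enumerate(essays)}
    sample = rng.sample(range(len(essays)), max(1, int(len(essays) * rate)))
    flagged = [i for i in sample if human_score(essays[i]) != scores[i]]
    return scores, flagged
```

In this sketch a Lowell-style error (a zero where a six belonged) is only caught if the essay happens to land in the audited sample, which is why a systematic glitch can slip past a 10 percent check.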
“All updated scores are higher than the previously released preliminary scores,” said the message, which NBC10 obtained via public records request. “No student scores were lowered because of the changes.”

Parents like Title said each score makes a difference, especially in districts being tracked to ensure they are making progress and meeting performance benchmarks.

“This really matters. Students, districts and teachers are assessed based on these scores,” Title said. “If it’s not accurate, that’s really unfair and affects people in our community.”