Automated Scoring in Learning Progression-Based Assessment: A Comparison of Researcher and Machine Interpretations

Hui Jin, Cynthia Lima, Limin Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Although AI transformer models have demonstrated notable capability in automated scoring, it is difficult to examine how and why these models fall short when scoring some responses. This study investigated how transformer models' language processing and quantification can be leveraged to enhance the accuracy of automated scoring. Automated scoring was applied to five science items. Results indicate that including item descriptions prior to student responses provides additional contextual information to the transformer model, allowing it to generate automated scoring models with improved performance. These automated scoring models achieved scoring accuracy comparable to human raters. However, they struggled to evaluate responses that contain complex scientific terminology and to interpret responses that contain unusual symbols, atypical language errors, or logical inconsistencies. These findings underscore the importance of efforts by both researchers and teachers in advancing the accuracy, fairness, and effectiveness of automated scoring.
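A minimal sketch of the input-construction idea the abstract describes: prepending the item description to each student response so a transformer-based scorer sees the item context. This is not the authors' code; the model name, the number of learning-progression levels, and the example texts are illustrative assumptions.

```python
# Illustrative sketch only: the paper does not specify the model or framework.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

MODEL_NAME = "bert-base-uncased"  # assumed base model
NUM_LEVELS = 4                    # assumed number of progression levels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_LEVELS
)

item_description = "Explain why ice melts faster on a metal tray than on a plastic tray."
student_response = "Metal conducts heat from the room into the ice more quickly."

# Encode the item description and the response as a sentence pair, so the
# model receives the contextual information the abstract credits with
# improving scoring performance.
inputs = tokenizer(
    item_description,
    student_response,
    truncation=True,
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits
predicted_level = int(logits.argmax(dim=-1))
print(f"Predicted progression level: {predicted_level}")
```

In practice such a classifier would be fine-tuned on human-scored responses before its predictions were compared against human raters, as the study does.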

Original language: English
Pages (from-to): 25-37
Number of pages: 13
Journal: Educational Measurement: Issues and Practice
Volume: 44
Issue number: 3
DOIs
State: Published - Aug 13 2025
Externally published: Yes

Scopus Subject Areas

  • Education

Keywords

  • automated scoring
  • constructed responses
  • learning progression
