Brief · PulseAugur

TOOL · arXiv cs.CL · English (EN) · 7h

From Scoring to Explanations: Evaluating SHAP and LLM Rationales for Rubric-based Teaching Quality Assessment

Researchers have developed a new framework for interpreting how automated scoring models assign quality ratings to complex language performances, such as classroom transcripts. The framework combines model-agnostic Shapley-value attributions with explanations generated by large language models (LLMs). In tests on the Quality of Feedback dimension of the CLASS framework, Shapley values proved more reliable and transferable than LLM-generated rationales for explaining model predictions.

IMPACT Provides a more robust way to evaluate how faithful and transferable AI model explanations are in educational assessment.

LLMs
SHAP
NCTE corpus
CLASS framework