From Texts to Scores: Tracing the Emergence of Essay Quality Representations in Large Language Models
Researchers have investigated how Large Language Models (LLMs) represent essay quality internally. Using methods like linear probing and neuron-level analyses on eight different LLMs across multiple datasets, they found that information about essay quality is encoded in a linearly accessible form within the models' representations. This information emerges progressively through the model's layers and shows some transferability across different prompts and scoring rubrics. The study also identified specific neurons that correlate strongly with essay scores and whose behavior changes based on essay length.
AI IMPACT: Provides insights into the interpretability of LLMs for automated essay scoring, suggesting structured representations of quality are present.
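For readers unfamiliar with linear probing, the idea can be sketched as follows: fit a linear regressor from a layer's hidden states to essay scores, then compare held-out fit across layers. This is a minimal illustration, not the study's code. The hidden states here are synthetic stand-ins (random noise plus a score signal whose strength is set by hand to grow with "depth"), and every name, dimension, and the ridge penalty are assumptions chosen for the example.

```python
import numpy as np


def fit_ridge_probe(X, y, l2=1.0):
    """Closed-form ridge regression on centered data; returns (weights, bias)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    dim = X.shape[1]
    w = np.linalg.solve(Xc.T @ Xc + l2 * np.eye(dim), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def r2_score(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def synthetic_layer_states(scores, signal, dim, rng):
    """Hypothetical stand-in for one layer's hidden states:
    Gaussian noise plus the centered score along one random direction."""
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    noise = rng.normal(size=(len(scores), dim))
    return noise + signal * np.outer(scores - scores.mean(), direction)


def probe_layers(signals=(0.0, 0.5, 1.0, 2.0), n=400, dim=32, seed=0):
    """Fit one probe per simulated layer and return held-out R^2 values.
    `signals` is an assumed per-layer signal strength, not measured data."""
    rng = np.random.default_rng(seed)
    scores = rng.integers(1, 7, size=n).astype(float)  # synthetic scores in 1..6
    split = n // 2
    results = []
    for signal in signals:
        H = synthetic_layer_states(scores, signal, dim, rng)
        w, b = fit_ridge_probe(H[:split], scores[:split])
        results.append(r2_score(scores[split:], H[split:] @ w + b))
    return results


if __name__ == "__main__":
    for layer, r2 in enumerate(probe_layers()):
        print(f"simulated layer {layer}: held-out R^2 = {r2:.3f}")
```

With real models, `synthetic_layer_states` would be replaced by hidden states extracted from each layer for each essay. A held-out R^2 that rises with depth is the kind of pattern the summary describes as quality information emerging progressively.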