Trait-Aware Policy Optimization for Autoregressive Multi-Trait Essay Scoring
Researchers have introduced Trait-Aware Policy Optimization (TAPO), a post-training framework for autoregressive models that score essays on multiple traits. The method decomposes rewards across samples and traits and combines three signals: global consistency, trait accuracy, and inter-trait dependencies. Experiments indicate that TAPO significantly improves scoring performance over standard supervised fine-tuning and scalar-reward optimization.
AI IMPACT This research could lead to more nuanced and accurate AI-powered essay evaluation systems.
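
To make the idea of decomposing rewards across samples and traits concrete, here is a minimal Python sketch. It is not the TAPO implementation: the function names, the reward formula (one minus the normalized absolute error), and the per-trait normalization are all assumptions chosen for illustration. It covers only a trait-accuracy signal, because the summary does not define the global-consistency and inter-trait terms.

```python
from statistics import mean, pstdev


def trait_rewards(pred, gold, score_range):
    """Hypothetical trait-accuracy reward in [0, 1] for one sampled output."""
    return {t: 1.0 - abs(pred[t] - gold[t]) / score_range for t in gold}


def per_trait_advantages(samples, gold, score_range, eps=1e-8):
    """Normalize rewards across samples separately for each trait.

    Assumption: a group-relative normalization (subtract the mean, divide by
    the standard deviation) applied per trait instead of to one scalar.
    """
    rewards = [trait_rewards(p, gold, score_range) for p in samples]
    advantages = [{} for _ in samples]
    for t in gold:
        column = [r[t] for r in rewards]
        mu, sigma = mean(column), pstdev(column)
        for i, value in enumerate(column):
            advantages[i][t] = (value - mu) / (sigma + eps)
    return advantages


# Toy data: two sampled score predictions for the same essay.
gold = {"content": 4, "organization": 3}
samples = [
    {"content": 4, "organization": 1},  # right on content, wrong on organization
    {"content": 2, "organization": 3},  # wrong on content, right on organization
]

adv = per_trait_advantages(samples, gold, score_range=4)
scalar = [mean(trait_rewards(p, gold, 4).values()) for p in samples]
```

In this toy case a scalar reward that averages the traits gives both samples 0.75, so it cannot tell them apart. The per-trait advantages stay distinct: the first sample is favored on content and the second on organization.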