Explanation Quality Assessment as Ranking with Listwise Rewards
Researchers have reframed the evaluation of AI explanation quality from a generation task to a ranking problem. Instead of producing a single best explanation, models are trained to discern the relative quality of multiple candidate explanations. This approach, which uses listwise and pairwise ranking models, has been shown to separate explanation quality levels better than regression methods. Notably, smaller encoder models trained on high-quality data can achieve performance comparable to much larger models, and these ranking-based rewards support stable policy optimization in settings where regression-based rewards fail.
AI IMPACT: This research suggests that improved data quality and ranking-based reward models can lead to more efficient and stable training of AI systems, potentially reducing computational costs.
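
To make the distinction between pairwise and listwise ranking objectives concrete, here is a minimal sketch in plain Python. The summary does not say which loss functions the researchers used, so both are assumptions chosen for illustration: a logistic (Bradley-Terry style) pairwise loss and a ListNet-style top-1 listwise loss. The function names and the example scores and labels are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_loss(score_better, score_worse):
    """Logistic pairwise loss: -log(sigmoid(score_better - score_worse)).

    Assumed form for illustration; small when the better explanation
    is scored higher than the worse one.
    """
    diff = score_better - score_worse
    # Stable evaluation of log(1 + exp(-diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def listwise_loss(scores, quality_labels):
    """ListNet-style top-1 loss over a whole list of candidates.

    Cross-entropy between softmax(quality_labels) and softmax(scores).
    Assumed form for illustration, not taken from the source.
    """
    target = softmax(quality_labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Hypothetical example: three candidate explanations with graded quality
# labels (2 = best, 0 = worst) and two possible sets of model scores.
labels = [2, 1, 0]
well_ordered_scores = [3.0, 1.0, -1.0]
reversed_scores = [-1.0, 1.0, 3.0]

print(listwise_loss(well_ordered_scores, labels))
print(listwise_loss(reversed_scores, labels))
print(pairwise_loss(2.0, 0.0))
```

Both losses depend only on differences between scores within a comparison, not on their absolute values, whereas a regression objective has to match a target score exactly. The first listwise value printed is lower than the second because the well-ordered scores agree with the quality labels.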