A new framework has been proposed for evaluating the robustness of explanations in enterprise NLP systems. It uses a leave-one-out occlusion method to assess how stable token-level explanations remain under various perturbations. The study found that larger decoder-based LLMs, such as Llama 70B, provide significantly more stable explanations than smaller encoder-based models, and that stability improves with model scale.
AI impact: Provides a method for selecting more reliable NLP models for enterprise use, especially in compliance-sensitive applications.
Ranking rationale: Academic paper proposing a new evaluation framework for NLP explanations.
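As a rough illustration of the leave-one-out occlusion idea mentioned in the summary, the sketch below removes each token in turn, records how much a model score drops, and then compares the top-ranked tokens before and after a perturbation. Everything here is an assumption made for illustration: `toy_score` is a hypothetical keyword-weight scorer standing in for a real model, and top-k Jaccard overlap is only one plausible stability measure, not necessarily the metric used in the paper.

```python
from typing import Callable, List, Sequence, Set

# A scorer maps a token sequence to a scalar model output (e.g. a class probability).
Scorer = Callable[[Sequence[str]], float]


def loo_attributions(tokens: Sequence[str], score: Scorer) -> List[float]:
    """Leave-one-out occlusion: attribution of token i is the score drop when i is removed."""
    base = score(tokens)
    return [
        base - score(list(tokens[:i]) + list(tokens[i + 1:]))
        for i in range(len(tokens))
    ]


def top_k_tokens(tokens: Sequence[str], attributions: Sequence[float], k: int) -> Set[str]:
    """Return the (lowercased) tokens with the k largest attributions."""
    ranked = sorted(range(len(tokens)), key=lambda i: attributions[i], reverse=True)
    return {tokens[i].lower() for i in ranked[:k]}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical stand-in for a real classifier: sums fixed keyword weights.
WEIGHTS = {"refund": 2.0, "denied": 1.5, "urgent": 1.0}


def toy_score(tokens: Sequence[str]) -> float:
    return sum(WEIGHTS.get(t.lower(), 0.0) for t in tokens)


def explanation_stability(original: Sequence[str], perturbed: Sequence[str],
                          score: Scorer, k: int = 2) -> float:
    """Assumed stability measure: top-k attribution overlap between two inputs."""
    top_orig = top_k_tokens(original, loo_attributions(original, score), k)
    top_pert = top_k_tokens(perturbed, loo_attributions(perturbed, score), k)
    return jaccard(top_orig, top_pert)


original = "The refund was denied".split()
perturbed = "The refund request was denied".split()  # one inserted token
stability = explanation_stability(original, perturbed, toy_score)
print(stability)
```

With a real model, `score` would wrap a forward pass, and the perturbations would typically include paraphrases, typos, or token swaps; a score near 1 would suggest the same tokens drive the prediction before and after the perturbation.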
AI-generated summary · Google Gemini · from 1 source.