LLM self-explanations compared to human rationales in text classification

By PulseAugur Editorial · [1 source] · 2026-05-22 04:00

A new research paper systematically compares self-generated explanations from instruction-tuned LLMs with human-provided rationales in text classification tasks. The study evaluates the plausibility and faithfulness of these self-explanations across sentiment classification, forced labor detection, and claim verification. Findings indicate that the alignment between LLM self-explanations and human rationales varies with text length and task complexity, though LLMs do produce faithful token-level rationales.
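As a rough illustration of what token-level agreement between a self-explanation and a human rationale could look like, the sketch below computes an F1 overlap between two sets of token indices. This is an assumed, commonly used overlap measure and not necessarily the metric used in the paper; the function name `rationale_f1` and the example indices are hypothetical.

```python
def rationale_f1(model_tokens, human_tokens):
    """Token-level F1 between two rationales, each given as token indices.

    Hypothetical helper for illustration; the paper's own metrics may differ.
    """
    model_set, human_set = set(model_tokens), set(human_tokens)
    if not model_set and not human_set:
        # Both rationales empty: treated as full agreement (a convention assumed here).
        return 1.0
    overlap = len(model_set & human_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(model_set)  # share of model tokens humans also marked
    recall = overlap / len(human_set)     # share of human tokens the model recovered
    return 2 * precision * recall / (precision + recall)


# Example: the model highlights tokens 1-3, the annotator highlights tokens 2-4.
score = rationale_f1([1, 2, 3], [2, 3, 4])
```

A higher score under this measure would mean the model's highlighted tokens coincide more closely with the annotator's. It speaks only to plausibility; faithfulness needs a separate test of whether the highlighted tokens actually drive the model's prediction.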

IMPACT This research provides insights into the quality and faithfulness of LLM-generated explanations, which is crucial for improving model interpretability and user trust.

RANK_REASON The cluster contains an academic paper detailing a systematic comparison of LLM-generated explanations with human rationales.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Stephanie Brandl, Oliver Eberle · 2026-05-22 04:00

A Systematic Comparison between Extractive Self-Explanations and Human Rationales in Text Classification

arXiv:2410.03296v4 Announce Type: replace-cross Abstract: Instruction-tuned LLMs are able to provide an explanation about their output to users by generating self-explanations, without requiring the application of complex interpretability techniques. In this paper, we an…
