REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control
Researchers have developed REFLEX, a new self-refining paradigm for fact-checking that aims to improve the accuracy and faithfulness of explanations generated by large language models. This method disentangles factual content from stylistic elements by using self-disagreement signals to construct steering vectors. Experiments show REFLEX achieves state-of-the-art performance with a small number of self-refined samples and demonstrates effectiveness in reducing hallucinations and improving verdict accuracy on real-world data. AI
IMPACT Enhances LLM fact-checking reliability by reducing hallucinations and improving explanation faithfulness.