BALTO: Balanced Token-Level Policy Optimization for Hallucination Mitigation
Researchers have developed BALTO, a framework for mitigating hallucinations in large language models. It uses balanced token-level policy optimization to assign credit more effectively, addressing localized hallucinations and optimization biases. Experiments on several benchmarks, including ConFiQA and RAGTruth, show that BALTO significantly improves faithfulness and offers a better trade-off between faithfulness and informativeness than existing methods.
IMPACT: Introduces a new method to improve LLM faithfulness, potentially enabling more reliable deployment in knowledge-intensive applications.