LC-ERD: Mining Latent Logic for Self-Evolving Reasoning via Consistency-Regulated Reward Decomposition
Researchers have introduced LC-ERD, a framework designed to improve the reasoning capabilities of large language models. The method addresses challenges in self-alignment by mining the latent logic in the model's own reasoning processes. LC-ERD uses a Variational Logic Potential to denoise the reasoning manifold and a Multi-Agent Value Decomposition protocol to assess the utility of individual reasoning steps, aiming to provide more granular and accurate supervision.
AI IMPACT: Introduces a new method to improve LLM reasoning by addressing issues with self-alignment and reward signals.