What Makes Effective Supervision in Latent Chain-of-Thought: An Information-Theoretic Analysis
Researchers have analyzed Latent Chain-of-Thought (CoT) from an information-theoretic viewpoint, identifying issues such as gradient attenuation and representational drift. They propose a dual supervision approach: Trajectory Supervision for stepwise signals and Space Supervision to maintain latent space semantics. Experiments using the Unified Latent Probe (ULP) demonstrate that reasoning accuracy is tied to the information fidelity within the latent chain, suggesting that supervision should shift toward maximizing mutual information rather than geometric imitation.
AI IMPACT: Provides a theoretical framework for improving latent reasoning in LLMs, potentially leading to more robust and accurate internal thought processes.