
TOOL · arXiv cs.CL English(EN) · 7h

CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning

Researchers have introduced CoT-Space, a theoretical framework for understanding the internal reasoning processes of large language models (LLMs). The framework recasts multi-step Chain-of-Thought (CoT) reasoning, typically enhanced by Reinforcement Learning (RL), from a simple token-prediction task as an optimization problem within a continuous semantic space. It explains how the optimal CoT length emerges from a trade-off between underfitting and overfitting, which offers a mechanistic explanation for internal test-time scaling.

IMPACT Provides a theoretical foundation for optimizing LLM reasoning trajectories, which could improve performance on complex tasks.

Reinforcement Learning
Large Language Models
CoT-Space
Zeyu Gan