PulseAugur
Formalizing Latent Thoughts: Four Axioms of Thought Representation in LLMs

New framework reveals structural limits in LLM thought representations

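A new research paper introduces an axiomatic evaluation framework for assessing latent thought representations in large language models (LLMs). The framework is independent of downstream benchmark scores and formalizes four functional axioms: causality, minimality, separability, and stability. An audit of open-weight LLMs across 23 reasoning tasks found that no model satisfies all four axioms simultaneously, which suggests structural limits in how LLMs represent internal thoughts.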

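Impact: This research highlights a fundamental limitation in the reasoning capabilities of current LLMs and suggests that new architectures or training methods are needed.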

Ranking rationale: The cluster includes an academic paper that details a new evaluation framework for LLMs.

AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

  1. arXiv cs.CL TIER_1 English(EN) · Fahd Seddik, Fatemeh Fard

    Formalizing Latent Thoughts: Four Axioms of Thought Representation in LLMs

    arXiv:2606.27378v1 · Abstract: We introduce an axiomatic evaluation framework for latent thought representations in LLMs, comprising metrics that are independent of downstream benchmark scores and reveal representational failures that benchmark accuracy masks. Ex…

  2. Hugging Face Daily Papers TIER_1 English(EN)

    Formalizing Latent Thoughts: Four Axioms of Thought Representation in LLMs

    An axiomatic evaluation framework reveals systematic failures in latent thought representations of LLMs across multiple reasoning tasks, demonstrating that current representations fail to satisfy fundamental functional axioms consistently across different model architectures.

  3. arXiv cs.CV TIER_1 English(EN) · Yang Liu

    CoLT: Teaching Multi-Modal Models to Think in Latent Chain-of-Thought

    Chain-of-thought (CoT) reasoning has enabled multi-modal large language models (MLLMs) to tackle complex visual reasoning tasks by generating explicit intermediate reasoning steps in natural language. However, this text-based reasoning paradigm is inherently slow at inference tim…