Demystifying Variance in Circuit Discovery of LLMs

New circuit discovery method for large language models (LLMs) tackles the variance problem

By PulseAugur Editorial · [2 sources] · 2026-06-15 16:25

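A research paper newly posted on arXiv examines variability in circuit discovery methods for large language models (LLMs). The study identifies three main sources of variance: resampling, paraphrasing, and sample-level variance. The authors introduce CEAP, a new method that improves on the existing EAP-IG technique by reducing resampling variance. They also argue that paraphrasing variance suggests LLMs may be inherently hard to control, because differently worded prompts can activate different internal circuits. Sample-level variance, they argue, is largely benign: it stems from how unfaithfulness is defined rather than from defects in the circuits.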
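To make the idea of resampling variance concrete, the following is a minimal sketch and not the paper's CEAP or EAP-IG procedure. It assumes that each example yields one attribution score per edge, that a "circuit" is simply the top-k edges by absolute mean score, and that stability can be summarized by the mean pairwise Jaccard overlap between circuits found on bootstrap resamples. The function names, the synthetic scores, and the choice of Jaccard overlap are all illustrative assumptions.

```python
import random
from itertools import combinations


def mean_scores(per_example, indices):
    """Average per-example edge scores over the chosen example indices."""
    n_edges = len(per_example[0])
    return [
        sum(per_example[i][e] for i in indices) / len(indices)
        for e in range(n_edges)
    ]


def top_k_edges(scores, k):
    """Return the set of k edge indices with the largest absolute score."""
    ranked = sorted(range(len(scores)), key=lambda e: abs(scores[e]), reverse=True)
    return set(ranked[:k])


def resampling_jaccard(per_example, k, n_resamples, rng):
    """Mean pairwise Jaccard overlap of top-k edge sets across bootstrap resamples."""
    n = len(per_example)
    circuits = []
    for _ in range(n_resamples):
        # Bootstrap: draw n example indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        circuits.append(top_k_edges(mean_scores(per_example, idx), k))
    overlaps = [len(a & b) / len(a | b) for a, b in combinations(circuits, 2)]
    return sum(overlaps) / len(overlaps)


# Synthetic stand-in for per-example edge attribution scores (hypothetical data):
# the first 5 edges carry signal, the remaining edges are pure noise.
rng = random.Random(0)
n_examples, n_edges = 50, 40
per_example = [
    [rng.gauss(1.0 if e < 5 else 0.0, 1.0) for e in range(n_edges)]
    for _ in range(n_examples)
]

stability = resampling_jaccard(per_example, k=5, n_resamples=10, rng=rng)
print(f"mean pairwise Jaccard overlap: {stability:.3f}")
```

An overlap near 1 means the same edges are selected no matter which examples are drawn. Lower values indicate that the discovered circuit depends on the particular sample, which is the kind of instability the summary attributes to resampling.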

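Impact: Introduces a new method for improving the interpretability and controllability of large language models (LLMs), which helps in understanding and steering model behavior.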

Ranking rationale: This cluster contains a research paper published on arXiv about a new circuit discovery method for large language models (LLMs).


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Frank Zhengqing Wu, Francesco Tonin, Volkan Cevher · 2026-06-16 04:00

Demystifying Variance in Circuit Discovery of LLMs

arXiv:2606.16920v1 Announce Type: cross Abstract: Circuit discovery is a key technique in mechanistic interpretability to pinpoint the model components that are crucial for performing a given task. Although the current state-of-the-art method (EAP-IG) performs well on the metric …
arXiv cs.AI TIER_1 English(EN) · Volkan Cevher · 2026-06-15 16:25

Demystifying Variance in Circuit Discovery of LLMs

Circuit discovery is a key technique in mechanistic interpretability to pinpoint the model components that are crucial for performing a given task. Although the current state-of-the-art method (EAP-IG) performs well on the metric of (un)faithfulness, it suffers from substantial v…
