S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

S-Agent framework enhances VLMs for 3D spatial reasoning · Tracking 4 sources

By the PulseAugur editorial team · [7 sources] · 2026-06-18 00:00

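Researchers have introduced S-Agent, a new framework designed to strengthen the spatial reasoning of vision-language models (VLMs) in 3D environments. S-Agent combines temporal memory with a set of spatial tools, so it can build a continuous understanding of the 3D world from multi-view images instead of relying on static, frame-level analysis. The VLM acts as a semantic planner that decides what evidence is needed, while the spatial tools localize objects in 2D, lift them to 3D, and aggregate the results into spatial knowledge. Experiments show that S-Agent improves both open-source and closed-source VLMs without retraining, and that the fine-tuned version, S-Agent-8B, performs comparably to advanced models such as GPT-5.4 and Gemini 3.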
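The "localize in 2D, lift to 3D, aggregate" pipeline described above can be illustrated with a small sketch. This is not S-Agent's actual implementation: it assumes a standard pinhole camera model, a known metric depth for each detection, and that every view shares one camera frame (a real system would also apply each view's camera pose). The names `Intrinsics`, `lift_to_3d`, and `SpatialMemory` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics (assumed known for every view)."""
    fx: float
    fy: float
    cx: float
    cy: float


def lift_to_3d(u, v, depth, k):
    """Back-project pixel (u, v) with metric depth into camera coordinates."""
    x = (u - k.cx) * depth / k.fx
    y = (v - k.cy) * depth / k.fy
    return (x, y, depth)


@dataclass
class SpatialMemory:
    """Hypothetical memory: running mean of each object's 3D position."""
    sums: dict = field(default_factory=dict)
    counts: dict = field(default_factory=dict)

    def add(self, label, point):
        sx, sy, sz = self.sums.get(label, (0.0, 0.0, 0.0))
        self.sums[label] = (sx + point[0], sy + point[1], sz + point[2])
        self.counts[label] = self.counts.get(label, 0) + 1

    def position(self, label):
        n = self.counts[label]
        return tuple(s / n for s in self.sums[label])


# Two detections of the same object in different frames (made-up values).
k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
memory = SpatialMemory()
memory.add("chair", lift_to_3d(320, 240, 2.0, k))
memory.add("chair", lift_to_3d(420, 240, 2.0, k))
print(memory.position("chair"))
```

Averaging is only the simplest possible aggregation rule. It stands in here for whatever the framework actually does to turn per-frame detections into persistent spatial knowledge.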

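Impact: The framework could significantly improve AI's ability to understand and interact with 3D environments, with implications for robotics, autonomous systems, and virtual reality.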

Ranking rationale: This cluster covers a recent research paper on a novel framework for spatial reasoning in AI models.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 7 sources. How we write summaries →

Coverage sources [7]

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-18 00:00

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

S-Agent is a spatial reasoning framework that enhances visual language models with temporal memory and hierarchical spatial tools to enable continuous 3D world understanding from multi-view imagery.
arXiv cs.CV TIER_1 English(EN) · Yalun Dai, Hao Li, Shulin Tian, Runmao Yao, Yuhao Dong, Fangzhou Hong, Zhaoxi Chen, Fangfu Liu, Baoliang Tian, Dingwen Zhang, Tao Wang, Kim-Hui Yap, Ziwei Liu · 2026-06-19 04:00

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

arXiv:2606.20515v1. Abstract: Real-world spatial intelligence requires reasoning over a continuous and evolving 3D world, yet existing VLMs and tool-augmented agents largely remain tied to static, stateless inference from isolated visual observations. We introdu…
arXiv cs.CV TIER_1 English(EN) · Ziwei Liu · 2026-06-18 17:34

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

Real-world spatial intelligence requires reasoning over a continuous and evolving 3D world, yet existing VLMs and tool-augmented agents largely remain tied to static, stateless inference from isolated visual observations. We introduce S-Agent, a spatial tool-use…
MarkTechPost TIER_1 English(EN) · Asif Razzaq · 2026-06-19 22:51

NVIDIA AI Introduces SpatialClaw: A Training-Free Agent That Treats Code as the Action Interface for Spatial Reasoning

SpatialClaw is a training-free agent that writes Python in a persistent kernel, composing perception tools for 3D spatial reasoning.
Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-06-20 04:31

🤖 NVIDIA's SpatialClaw boosts spatial reasoning in VLMs by 11.2 points. NVIDIA's SpatialClaw framework has increased the spatial reasoning accuracy of vision language models…

🤖 NVIDIA's SpatialClaw boosts spatial reasoning in VLMs by 11.2 points NVIDIA's SpatialClaw framework has increased spatial reasoning accuracy in vision language models by 11.2 points over SpaceTools, reaching 59.9% average accuracy across 20 benchmarks. This new training free fr…

Link: synestesia.uk/…/nvidia-s-spatialclaw-boos…
Mastodon — mastodon.social TIER_1 English(EN) · [email protected] · 2026-06-20 04:55

NVIDIA's SpatialClaw is a training-free framework for spatial reasoning that treats code as the action interface. Across 20 benchmarks it reaches 59.9% accuracy…

NVIDIA's SpatialClaw is a training-free framework for spatial reasoning that treats code as the action interface. Across 20 benchmarks it reaches 59.9% accuracy, outperforming SpaceTools by 11.2 points. https://www.marktechpost.com/2026/06/19/nvidia-ai-introduce-spatialclaw-a-t…
Mastodon — mastodon.social TIER_1 English(EN) · [email protected] · 2026-06-19 23:53

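NVIDIA has unveiled SpatialClaw, a training-free AI agent that treats code as the action interface for spatial reasoning. It uses a Python kernel to compose perception…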

NVIDIA has unveiled SpatialClaw, a training-free AI agent that treats code as the action interface for spatial reasoning. Using a Python kernel to compose perception tools, it achieves 59.9% accuracy across 20 benchmarks - outperforming prior approaches by over 11 points. https:/…
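Several of the posts above describe SpatialClaw as writing Python in a persistent kernel and composing perception tools through code. The sketch below illustrates only that general idea, using Python's built-in `exec` with one shared namespace. It is not NVIDIA's implementation: `PersistentKernel` and the `distance` tool are hypothetical, and the coordinates are made up.

```python
import math


def distance(a, b):
    """Hypothetical geometry tool exposed to the agent."""
    return math.dist(a, b)


class PersistentKernel:
    """Runs code snippets against one shared namespace, so variables
    defined in an earlier step remain available to later steps."""

    def __init__(self, tools):
        self.namespace = dict(tools)

    def run(self, code):
        exec(code, self.namespace)


kernel = PersistentKernel({"distance": distance})

# Step 1: the agent stores intermediate results (made-up 3D positions).
kernel.run("chair = (0.0, 0.0, 2.0)\ntable = (3.0, 4.0, 2.0)")

# Step 2: a later snippet reuses that state and calls a tool.
kernel.run("gap = distance(chair, table)")

print(kernel.namespace["gap"])  # 5.0
```

Because `exec` runs arbitrary code, a real deployment would need sandboxing. The point here is only that state persists between steps, which is what lets an agent build on earlier results instead of recomputing them.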
