WorldCoder-Bench: Benchmarking Physically Grounded 3D World Synthesis

New benchmark tests LLMs' ability to create interactive 3D worlds

By PulseAugur Editorial Team · [1 source] · 2026-06-02 04:00

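Researchers have introduced WorldCoder-Bench, a new benchmark designed to evaluate how well large language models can synthesize physically grounded, interactive 3D worlds from natural language prompts. The benchmark comprises more than 2,000 tasks across simulation, rendering, and application scenarios, and incorporates hidden behavioral contracts to test program integration and state management. An initial evaluation of nine frontier models shows that even the best system achieves less than 30% verified coverage, highlighting major challenges in maintaining state consistency and interaction chains.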

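Impact: The benchmark is expected to drive progress in LLMs' ability to generate complex, interactive 3D environments, with implications for game development and virtual world creation.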

Ranking rationale: This cluster contains an academic paper introducing a new benchmark for evaluating AI capabilities.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · Shuo Lu, Yinuo Xu, Kecheng Yu, Siru Jiang, Yongcan Yu, Yubin Wang, Haitao Yang, Yuxiang Zhang, Bin Wang, Ran He, Jian Liang · 2026-06-02 04:00

WorldCoder-Bench: Benchmarking Physically Grounded 3D World Synthesis

arXiv:2606.01869v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly asked not only to write static interfaces, but to construct executable interactive worlds from natural language. Browser-native 3D, commonly built with Three.js, is a natural next fronti…

