SceneConductor: 3D Scene Generation from Single Image with Multi-Agent Orchestration
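New methods use multi-agent systems to generate and edit 3D indoor scenes
By the PulseAugur editorial team · [7 sources]
Researchers have developed new methods for generating and editing 3D indoor scenes. SceneConductor uses a multi-agent orchestration framework that splits the process into initialization, environment construction, and refinement stages, improving geometric accuracy and realism. AccioScene uses graph diffusion and an interaction-driven critic to create coherent 3D scenes from text prompts, with a focus on functional plausibility and human-computer interaction. HDSL introduces a hierarchical domain-specific language for structured scene representation, letting LLM agents generate and edit scenes more effectively through local revisions.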
AI
arXiv:2502.06819v2 Announce Type: replace Abstract: This paper presents a framework for generating 3D indoor scenes from text prompts. Existing methods often formulate scene synthesis as an object layout prediction problem conditioned on a single input modality, such as a text de…
arXiv cs.AI
Jeonghwan Kim, Yushi Lan, Yongwei Chen, Hieu Trung Nguyen, Chuanyu Pan, Xingang Pan
arXiv:2606.08402v1 Announce Type: cross Abstract: Generating complete 3D scenes from a single image requires inferring globally consistent geometry, object relationships, and environmental context from inherently ambiguous visual evidence. Despite recent progress in joint layout-and-mesh generation, existing methods often rely o…
arXiv:2606.13345v1 Announce Type: new Abstract: Existing 3D scene editing methods typically rely on per-scene optimization over explicit 3D representations or cascaded edit-and-reconstruct pipelines, resulting in high test-time cost, limited 3D awareness, and structural inconsistencies. To couple appearance synthesis and geome…
arXiv:2606.09738v1 Announce Type: new Abstract: Text-driven indoor scene generation and editing require an intermediate representation that language models can both produce and revise. Existing LLM-based systems often rely on scene graphs or global constraint lists, which are compact but underspecify local geometry and make in…