When Do Diffusion Models learn to Generate Multiple Objects?

Diffusion models struggle with multi-object generation because of scene complexity

By PulseAugur Editorial Team · [2 sources] · 2026-04-30 22:18

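A new research paper investigates why diffusion models struggle to generate multiple objects in a single image. The study introduces a controlled dataset generation framework called "mosaic" to analyze concept generalization and compositional generalization. The results indicate that scene complexity, rather than data imbalance, is the main factor limiting multi-object generation, and that counting tasks are especially difficult in low-data regimes.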

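Impact: Highlights fundamental limitations of diffusion models in multi-object generation and points to a need for better inductive biases and data design.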

Ranking rationale: Academic paper on the limitations of diffusion models.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CV TIER_1 English(EN) · Yujin Jeong, Arnas Uselis, Iro Laina, Seong Joon Oh, Anna Rohrbach · 2026-05-04 04:00

When Do Diffusion Models learn to Generate Multiple Objects?

arXiv:2605.00273v1 · Abstract: Text-to-image diffusion models achieve impressive visual fidelity, yet they remain unreliable in multi-object generation. Despite extensive empirical evidence of these failures, the underlying causes remain unclear. We begin by aski…

arXiv cs.CV TIER_1 English(EN) · Anna Rohrbach · 2026-04-30 22:18

When Do Diffusion Models learn to Generate Multiple Objects?

Text-to-image diffusion models achieve impressive visual fidelity, yet they remain unreliable in multi-object generation. Despite extensive empirical evidence of these failures, the underlying causes remain unclear. We begin by asking how much of this limitation arises from the d…