New Module Improves Diffusion Transformer Image Quality

By the PulseAugur editorial team · [1 source] · 2026-07-01 04:00

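Researchers have introduced a Quality Representation Module (QRM) to enhance text-to-image diffusion models, specifically Diffusion Transformers (DiT). The lightweight module learns a quality-aware representation from the model's existing inputs and produces vectors that adjust the adaptive LayerNorm (adaLN) modulation inside the DiT transformer blocks. By injecting this quality-sensitive signal, QRM aims to improve the fidelity and consistency of generated images without changing the core diffusion process or the sampling schedule. According to the reported experiments, QRM consistently improves image quality over standard DiT models.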
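To make the mechanism concrete, here is a minimal NumPy sketch of how a side module could nudge adaLN modulation. It is not the paper's architecture, which the summary does not specify. The class `QualityModule`, the function `modulated_block_input`, the two-layer MLP, the additive combination of offsets, and the zero-initialized output layer are all assumptions made for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """LayerNorm over the last axis, without learned affine parameters."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def adaln_modulate(x, shift, scale):
    """Standard adaLN modulation: normalize, then scale and shift per channel."""
    # x: (batch, tokens, dim); shift, scale: (batch, dim)
    return layer_norm(x) * (1.0 + scale[:, None, :]) + shift[:, None, :]

class QualityModule:
    """Hypothetical stand-in for a QRM: a small MLP that maps the existing
    conditioning vector to offsets for the adaLN shift and scale."""

    def __init__(self, cond_dim, hidden_dim, model_dim, rng):
        self.w1 = rng.normal(0.0, 0.02, size=(cond_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        # Assumption: zero-initialized output layer, so the module starts
        # as a no-op and the base model's behavior is preserved.
        self.w2 = np.zeros((hidden_dim, 2 * model_dim))
        self.b2 = np.zeros(2 * model_dim)

    def __call__(self, cond):
        h = np.tanh(cond @ self.w1 + self.b1)
        out = h @ self.w2 + self.b2
        d_shift, d_scale = np.split(out, 2, axis=-1)
        return d_shift, d_scale

def modulated_block_input(x, cond, w_mod, b_mod, qrm):
    """Compute base adaLN parameters from cond, then add the module's offsets."""
    base = cond @ w_mod + b_mod
    shift, scale = np.split(base, 2, axis=-1)
    d_shift, d_scale = qrm(cond)
    return adaln_modulate(x, shift + d_shift, scale + d_scale)
```

In this sketch the denoising loop and sampler are untouched: only the shift and scale fed to each block change, which matches the summary's claim that the core diffusion process and sampling schedule stay the same.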

Impact: The module may lead to more consistent, higher-fidelity image generation from diffusion models.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.LG · Luke Budny, Yuhong Guo, Kevin Cheung · 2026-07-01 04:00

Quality-Aware Modulation for Diffusion Transformers

arXiv:2606.30934v1 Announce Type: new Abstract: Modern text-to-image diffusion models, such as diffusion transformers (DiT), rely on timestep or prompt embeddings to modulate the strength of the denoising process in each timestep. While this modulation communicates the current no…
