MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Diffusion Transformers Advance Image Generation and Material Transfer

By the PulseAugur editorial team · [11 sources] · 2026-05-15 06:31

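Researchers have reported several advances in Diffusion Transformer (DiT) architectures for image generation and processing. One paper examines register tokens in pixel-space DiTs as a way to improve convergence and generation quality, and finds that they yield cleaner feature maps. Another proposes HyperDiT, which uses hyper-connected cross-scale interaction and registers to bridge the semantic and pixel manifolds for high-fidelity generation. ElasticDiT targets efficiency on mobile devices by dynamically adjusting its architecture and using sparse attention, while DreamSR enhances super-resolution by combining global and local text features. Finally, DealMaTe and MaTe simplify material transfer by removing text guidance and relying on image inputs within the DiT framework.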

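Impact: These advances in Diffusion Transformers offer higher image-generation fidelity, better efficiency on mobile devices, and new capabilities in super-resolution and material transfer.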

Ranking rationale: Multiple research papers published on arXiv detail new architectures and techniques for Diffusion Transformers.


AI-generated summary · Google Gemini · from 11 sources.

Sources [11]

arXiv cs.CV TIER_1 English(EN) · Yunhai Tong · 2026-05-20 17:59

Single-Step Distillation of Discrete Diffusion Image Generators via Fixed-Point Iteration

Discrete diffusion models excel at visual synthesis but rely on slow, iterative decoding. Existing single-step distillation methods attempt to bypass this bottleneck, either by training auxiliary score networks that effectively double compute, or by introducing specialized parame…
arXiv stat.ML TIER_1 English(EN) · Lifu Wei, Yinuo Ren, Naichen Shi, Yiping Lu · 2026-05-19 04:00

SURGE: Approximation-Free Particle Filter for Diffusion Surrogates

arXiv:2605.18745v1. Diffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require …
arXiv stat.ML TIER_1 English(EN) · Yiping Lu · 2026-05-18 17:59

SURGE: Approximation-Free Particle Filter for Diffusion Model Surrogates

Diffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require repeated score or gradient evaluations, introduc…
arXiv cs.CV TIER_1 English(EN) · Haohuan Fu · 2026-05-18 07:35

Learning to Balance: Decoupled Siamese Diffusion Transformer for Reference-Based Remote Sensing Image Super-Resolution

Diffusion-based methods demonstrate significant potential for remote sensing image super-resolution at large scaling factors, particularly in reference-based super-resolution (RefSR) where high-resolution reference images provide critical fine-grained texture priors. However, exi…
arXiv cs.CV TIER_1 English(EN) · Yan Li · 2026-05-18 02:25

FrequencyBooster: Full-Frequency Modeling for High-Fidelity Pixel Diffusion

To circumvent the inherent fidelity bottlenecks and optimization misalignment of VAE-based latent diffusion, pixel-space diffusion models have emerged as a compelling end-to-end paradigm. However, existing pixel diffusion models often struggle to balance computational efficiency …
arXiv cs.CV TIER_1 English(EN) · Dmitry Baranchuk · 2026-05-15 16:27

The Importance of Registers for Pixel-Space Diffusion Transformers

Vision Transformers (ViTs) are known to exhibit high-norm patch-token outliers that degrade feature map quality, a problem effectively mitigated by register tokens. As diffusion models increasingly adopt transformer architectures and move toward pixel-space training, the…
arXiv cs.CV TIER_1 English(EN) · Yan Li · 2026-05-15 08:51

HyperDiT: Hyper-Connected Transformer for High-Fidelity Pixel-Space Diffusion

Pixel-space diffusion models bypass the reconstruction bottleneck of Variational Autoencoders (VAEs) but face a fundamental "granularity dilemma": capturing global semantics favors large patch scales, while generating high-fidelity details demands fine-grained inputs. To address …
arXiv cs.CV TIER_1 English(EN) · Xinghao Chen · 2026-05-15 07:13

ElasticDiT: Efficient Diffusion Transformer via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices

The Diffusion Transformer (DiT) architecture is the state-of-the-art paradigm for high-fidelity image generation, underpinning models like Stable Diffusion-3 and FLUX.1. However, deploying these models on resource-constrained mobile devices entails prohibitive computational and m…
arXiv cs.CV TIER_1 English(EN) · Yitong Wang · 2026-05-15 07:08

DreamSR: Ultra-High-Resolution Image Super-Resolution via a Receptive-Field-Enhanced Diffusion Transformer

Large-scale pre-trained diffusion models have been extensively adopted for real-world image Super-Resolution because of their powerful generative priors through textual guidance. However, when super-resolving high-resolution images with patch-wise inference strategy, most existin…
arXiv cs.CV TIER_1 English(EN) · Zitong Yu · 2026-05-15 07:06

DealMaTe: Multi-Dimensional Material Transfer via Diffusion Transformer

Recently, diffusion-based material transfer methods rely on image fine-tuning or complex architectures with auxiliary networks but face challenges such as text dependency, additional computational costs, and feature misalignment. To address these limitations, we propose D…
arXiv cs.CV TIER_1 English(EN) · Xiu Li · 2026-05-15 06:31

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Recent diffusion-based methods for material transfer rely on image fine-tuning or complex architectures with assistive networks, but face challenges including text dependency, extra computational costs, and feature misalignment. To address these limitations, we propose MaTe, a st…
