PulseAugur
English(EN) MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Diffusion Transformers Advance Image Generation and Material Transfer

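Researchers have made several advances in Diffusion Transformer (DiT) architectures for image generation and processing. One paper examines register tokens in pixel-space DiTs as a way to improve convergence and generation quality, finding that they yield cleaner feature maps. Another proposes HyperDiT, which uses hyper-connected cross-scale interactions and registers to bridge the semantic and pixel manifolds for high-fidelity generation. ElasticDiT targets efficiency on mobile devices by dynamically adjusting its architecture and using sparse attention, while DreamSR enhances super-resolution by combining global and local text features. Finally, DealMaTe and MaTe simplify material transfer by eliminating text guidance and relying on image inputs within a DiT framework.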

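Impact: These advances in Diffusion Transformers offer higher image-generation fidelity, better efficiency on mobile devices, and new capabilities for super-resolution and material transfer.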

Ranking rationale: Several research papers published on arXiv detail new architectures and techniques for Diffusion Transformers.


AI-generated summary · Google Gemini · from 11 sources.


Sources [11]

  1. arXiv cs.CV TIER_1 English(EN) · Yunhai Tong

    One-Step Distillation of Discrete Diffusion Image Generators via Fixed-Point Iteration

    Discrete diffusion models excel at visual synthesis but rely on slow, iterative decoding. Existing single-step distillation methods attempt to bypass this bottleneck, either by training auxiliary score networks that effectively double compute, or by introducing specialized parame…

  2. arXiv stat.ML TIER_1 English(EN) · Lifu Wei, Yinuo Ren, Naichen Shi, Yiping Lu

    SURGE: An Approximation-Free Particle Filter for Diffusion Surrogates

    arXiv:2605.18745v1. Diffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require …

  3. arXiv stat.ML TIER_1 English(EN) · Yiping Lu

    SURGE: An Approximation-Free Particle Filter for Diffusion Model Surrogates

    Diffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require repeated score or gradient evaluations, introduc…

  4. arXiv cs.CV TIER_1 English(EN) · Haohuan Fu

    Learning to Balance: A Decoupled Siamese Diffusion Transformer for Reference-Based Remote Sensing Image Super-Resolution

    Diffusion-based methods demonstrate significant potential for remote sensing image super-resolution at large scaling factors, particularly in reference-based super-resolution (RefSR) where high-resolution reference images provide critical fine-grained texture priors. However, exi…

  5. arXiv cs.CV TIER_1 English(EN) · Yan Li

    FrequencyBooster: Full-Frequency Modeling for High-Fidelity Pixel Diffusion

    To circumvent the inherent fidelity bottlenecks and optimization misalignment of VAE-based latent diffusion, pixel-space diffusion models have emerged as a compelling end-to-end paradigm. However, existing pixel diffusion models often struggle to balance computational efficiency …

  6. arXiv cs.CV TIER_1 English(EN) · Dmitry Baranchuk

    Registers Matter for Pixel-Space Diffusion Transformers

    Vision Transformers (ViTs) are known to exhibit high-norm patch-token outliers that degrade feature map quality, a problem effectively mitigated by register tokens. As diffusion models increasingly adopt transformer architectures and move toward pixel-space training, the…

  7. arXiv cs.CV TIER_1 English(EN) · Yan Li

    HyperDiT: Hyper-Connected Transformers for High-Fidelity Pixel-Space Diffusion

    Pixel-space diffusion models bypass the reconstruction bottleneck of Variational Autoencoders (VAEs) but face a fundamental "granularity dilemma": capturing global semantics favors large patch scales, while generating high-fidelity details demands fine-grained inputs. To address …

  8. arXiv cs.CV TIER_1 English(EN) · Xinghao Chen

    ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices

    The Diffusion Transformer (DiT) architecture is the state-of-the-art paradigm for high-fidelity image generation, underpinning models like Stable Diffusion-3 and FLUX.1. However, deploying these models on resource-constrained mobile devices entails prohibitive computational and m…

  9. arXiv cs.CV TIER_1 English(EN) · Yitong Wang

    DreamSR: Ultra-High-Resolution Image Super-Resolution via Receptive-Field-Enhanced Diffusion Transformers

    Large-scale pre-trained diffusion models have been extensively adopted for real-world image Super-Resolution because of their powerful generative priors through textual guidance. However, when super-resolving high-resolution images with patch-wise inference strategy, most existin…

  10. arXiv cs.CV TIER_1 English(EN) · Zitong Yu

    DealMaTe: Multi-Dimensional Material Transfer via Diffusion Transformer

    Recently, diffusion-based material transfer methods rely on image fine-tuning or complex architectures with auxiliary networks but face challenges such as text dependency, additional computational costs, and feature misalignment. To address these limitations, we propose D…

  11. arXiv cs.CV TIER_1 English(EN) · Xiu Li

    MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

    Recent diffusion-based methods for material transfer rely on image fine-tuning or complex architectures with assistive networks, but face challenges including text dependency, extra computational costs, and feature misalignment. To address these limitations, we propose MaTe, a st…