New Diffusion Transformers Advance Image Generation and Transmission

By PulseAugur Editorial Team · [5 sources] · 2026-06-02 01:42

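Researchers are developing new diffusion Transformer models for advanced image generation and transmission. One approach, DDM-SSCC, applies diffusion language models to lossless pixel-level image transmission and outperforms existing methods under noisy channel conditions. Another model, HyperDiT, bridges the semantic and pixel manifolds, using hyper-connected cross-scale interaction to achieve high-fidelity pixel generation. In addition, PixelDiT, a 1.3-billion-parameter model, offers VAE-free text-to-image generation and supports image editing and a range of aspect ratios.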

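Impact: These diffusion Transformer advances are pushing the boundaries of image generation fidelity and efficiency, and could affect fields that need high-quality visual content and robust image transmission.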

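Ranking rationale: Multiple research papers and community discussions on new diffusion Transformer architectures for image generation and transmission.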

AI-generated summary · Google Gemini · from 5 sources.

Sources [5]

arXiv cs.AI TIER_1 English(EN) · Tianqi Ren, Rongpeng Li, Xianfu Chen, Yingyu Li, Zhifeng Zhao · 2026-06-06 04:00

Adapting Diffusion Language Models for Lossless Pixel-Level Image Transmission

arXiv:2606.06273v1 Announce Type: cross Abstract: Lossless pixel-level image transmission is a fundamental regime beyond semantic communications, because exact recovery requires both accurate symbol probability modeling and reliable delivery over noisy channels. This paper propos…
arXiv cs.AI TIER_1 English(EN) · Zhifeng Zhao · 2026-06-04 15:14

Adapting Diffusion Language Models for Lossless Pixel-Level Image Transmission

Lossless pixel-level image transmission is a fundamental regime beyond semantic communications, because exact recovery requires both accurate symbol probability modeling and reliable delivery over noisy channels. This paper proposes DDM-SSCC, a discrete-diffusion-model-based sepa…
arXiv cs.CV TIER_1 English(EN) · Yu He, Lichen Ma, Zipeng Guo, Xinyuan Shan, Jingling Fu, Dong Chen, Junshi Huang, Yan Li · 2026-06-04 04:00

HyperDiT: Hyper-Connected Transformers for High-Fidelity Pixel-Space Diffusion

arXiv:2605.15741v2 Announce Type: replace Abstract: Pixel-space diffusion models bypass the reconstruction bottleneck of Variational Autoencoders (VAEs) but face a fundamental "granularity dilemma": capturing global semantics favors large patch scales, while generating high-fidel…
r/StableDiffusion TIER_2 English(EN) · /u/CornyShed · 2026-06-02 15:34

PixelDiT: Pixel Diffusion Transformer for Image Generation, 1.3B, VAE-Free

PixelDiT is a 1.3B parameter text-to-image model by NVIDIA with image editing capabilities. Key features: VAE-free; Dual-level architecture: Patch-level DiT + Pixel-level DiT; MM-DiT text-image fusion: Joint at…
r/StableDiffusion TIER_2 English(EN) · /u/madtune22 · 2026-06-02 01:42

PixelDiT: 1.3B pixel-space diffusion transformer, no VAE, 4GB VRAM, now 100% diffusers compatible with Qwen encoder support

https://www.reddit.com/r/StableDiffusion/comments/1tuco68/pixeldit_13b_pixelspace_diffusion_transformer_no/
