PulseAugur
English(EN) Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

New frameworks improve alignment of text-to-image models with human preferences

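Researchers have developed two new frameworks, DIDR and RTDMD, to better align text-to-image generation models with human preferences. DIDR (Diff-Instruct with Diffused Reward) is a data-free framework that optimizes reward at every noise level of the diffusion trajectory, which improves image fidelity. RTDMD is a two-stage method that combines distribution matching distillation with reward-guided reinforcement learning for few-step generators. Both methods show significant improvements on preference, aesthetic, and compositional metrics, and RTDMD achieves state-of-the-art results on models such as SD3, SD3.5, and FLUX.2 with only a few inference steps.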

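Impact: These frameworks offer ways to better align AI image generation with user preferences, potentially producing more aesthetically appealing and compositionally accurate outputs with fewer computational resources.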

Ranking rationale: This cluster contains two research papers detailing novel frameworks for improving text-to-image generation models.

Read on Hugging Face Daily Papers →

AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

  1. arXiv cs.AI TIER_1 English(EN) · Junyi Wu, Weijian Luo, Haoyang Zheng, Runzhe Zhang, Guang Lin ·

    Diff-Instruct with Diffused Reward: Towards Principled One-step Generator RL

    arXiv:2605.24001v1 Announce Type: cross Abstract: Recent advances in one-step text-to-image generation have enabled real-time synthesis with remarkable efficiency and quality. Previous reinforcement learning methods for one-step generators combine image-space reward optimization …

  2. Hugging Face Daily Papers TIER_1 English(EN) ·

    Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

    RTDMD is a two-stage framework that combines distribution matching distillation with reward-guided reinforcement learning to improve few-step image generation alignment with human preferences.

  3. arXiv cs.CV TIER_1 English(EN) · Yushi Huang, Xiangxin Zhou, Ruoyu Wang, Chi Zhang, Jun Zhang, Tianyu Pang ·

    Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

    arXiv:2605.26108v1 Announce Type: new Abstract: Recent advances in few-step diffusion distillation have enabled efficient image generation, yet aligning these models with human preferences remains challenging. We propose Reward-Tilted Distribution Matching Distillation (RTDMD), a…

  4. arXiv cs.CV TIER_1 English(EN) · Tianyu Pang ·

    Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

    Recent advances in few-step diffusion distillation have enabled efficient image generation, yet aligning these models with human preferences remains challenging. We propose Reward-Tilted Distribution Matching Distillation (RTDMD), a two-stage framework that unifies distribution m…