PulseAugur
How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

Researchers develop flow map guidance for faster, better-aligned generative models

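Researchers have developed new methods for guiding generative models, particularly for text-to-image synthesis. One of them, Flow Map Reward Guidance (FMRG), recasts guidance as an optimal control problem and uses a flow map to integrate and guide along a single trajectory, which makes it significantly faster and lets it match or exceed existing methods in fewer steps. Another, LeapAlign, addresses the computational cost of fine-tuning flow matching models by shortening the long generation trajectory to two leaps, which allows efficient, stable updates at any generation step and outperforms current state-of-the-art techniques in image quality and alignment. A third paper studies constraint-aware flow matching, proposing adaptations that either penalize the distance to the constraint set or, when the constraint set can only be queried, rely on randomization.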

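Impact: These advances in guiding and aligning generative models could lead to more efficient and more controllable image synthesis and other generative tasks.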

Ranking rationale: This cluster contains several academic papers detailing new methods for generative modeling and alignment.

AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

  1. arXiv cs.AI TIER_1 English(EN) · Jerry Y. Huang, Justin Lin, Sheel Shah, Kartik Nair, Nicholas M. Boffi ·

    How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

    arXiv:2604.27147v1 Announce Type: cross Abstract: In generative modeling, we often wish to produce samples that maximize a user-specified reward such as aesthetic quality or alignment with human preferences, a problem known as guidance. Despite their widespread use, existing guid…

  2. arXiv cs.LG TIER_1 English(EN) · Zhengyan Huan, Jacob Boerma, Li-Ping Liu, Shuchin Aeron ·

    Constraint-Aware Flow Matching via Randomized Exploration

    arXiv:2508.13316v2 Announce Type: replace Abstract: We consider the problem of designing constraint-aware flow matching (FM) models that address the issue of constraint violations commonly observed in vanilla generative models. We consider two scenarios, viz.: (a) when a differen…

  3. Hugging Face Daily Papers TIER_1 English(EN) ·

    How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

    In generative modeling, we often wish to produce samples that maximize a user-specified reward such as aesthetic quality or alignment with human preferences, a problem known as guidance. Despite their widespread use, existing guidance methods either require expensive multi-partic…

  4. arXiv cs.CV TIER_1 English(EN) · Zhanhao Liang, Tao Yang, Jie Wu, Chengjian Feng, Liang Zheng ·

    LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

    arXiv:2604.15311v2 Announce Type: replace Abstract: This paper focuses on the alignment of flow matching models with human preferences. A promising way is fine-tuning by directly backpropagating reward gradients through the differentiable generation process of flow matching. Howe…