PulseAugur

New research advances flow matching models through theoretical and algorithmic improvements

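Researchers have developed new theoretical foundations and practical algorithms for flow matching models, a class of generative models. One paper establishes convergence guarantees and generalization bounds for neural-network-parameterized conditional velocity fields. Another introduces Flow-DPPO, a reinforcement learning method that replaces ratio clipping with divergence proximal constraints for more stable and efficient training. A third method, RLDT, uses reinforcement learning with density transport to fine-tune flow-matching policies for continuous-control tasks and outperforms existing baselines.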
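To make the contrast between ratio clipping and a divergence proximal constraint concrete, here is a minimal sketch. It is not the Flow-DPPO implementation: it assumes a one-dimensional Gaussian policy step with equal variance before and after the update, so the KL divergence has a closed form, and it uses a simple penalty weight `beta` as a stand-in for whatever constraint the paper actually applies. All function names and default values are hypothetical.

```python
# Illustrative only: PPO-style ratio clipping versus a KL-penalized surrogate.
# Assumption: old and new policies are 1-D Gaussians sharing the same sigma.

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for a single sample."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)

def gaussian_kl(mu_old, mu_new, sigma):
    """Closed-form KL(N(mu_old, sigma^2) || N(mu_new, sigma^2))."""
    return (mu_old - mu_new) ** 2 / (2.0 * sigma ** 2)

def kl_penalized_objective(ratio, advantage, kl, beta=1.0):
    """Unclipped surrogate minus a divergence penalty (beta is hypothetical)."""
    return ratio * advantage - beta * kl
```

With clipping, the objective stops rewarding any ratio beyond `1 + eps` when the advantage is positive. With the penalty form, the update is discouraged in proportion to how far the new distribution has moved, measured by the divergence itself rather than by a per-sample ratio.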

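Impact: These advances in flow matching models could lead to more efficient and stable generative AI for tasks such as image and video generation, and to better performance on continuous-control problems.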

Ranking rationale: Multiple arXiv papers detail new theoretical frameworks and algorithmic improvements for flow matching models.


AI-generated summary · Google Gemini · from 8 sources.

Sources [8]

  1. arXiv cs.AI TIER_1 English(EN) · Yihan He, Qishuo Yin, Yuan Cao, Jianqing Fan, Han Liu ·

    A Theory on Flow Matching with Neural Networks

    arXiv:2606.10089v1 Announce Type: cross Abstract: In this work, we develop theoretical foundation for flow matching with neural-network-parameterized conditional velocity fields. We establish convergence guarantees for gradient descent in the over-parameterized 2-layered ReLU neu…

  2. arXiv cs.LG TIER_1 English(EN) · Bowen Ping, Xiangxin Zhou, Penghui Qi, Minnan Luo, Liefeng Bo, Tianyu Pang ·

    Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

    arXiv:2606.11025v1 Announce Type: new Abstract: Recent work has demonstrated that online reinforcement learning (RL) can substantially improve the quality and alignment of flow matching models for image and video generation. Methods such as Flow-GRPO and CPS cast the denoising pr…

  3. arXiv cs.LG TIER_1 English(EN) · Tianyu Pang ·

    Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

    Recent work has demonstrated that online reinforcement learning (RL) can substantially improve the quality and alignment of flow matching models for image and video generation. Methods such as Flow-GRPO and CPS cast the denoising process as a Markov Decision Process and apply PPO…

  4. arXiv cs.AI TIER_1 English(EN) · Boshu Lei, Kostas Daniilidis, Antonio Loquercio ·

    Reinforcement Learning for Flow-Matching Policies with Density Transport

    arXiv:2606.08602v1 Announce Type: cross Abstract: We present an online reinforcement learning (RL) algorithm for fine-tuning flow-matching policies in continuous-control problems. Our key insight is to view RL-based policy improvement as a transport of action densities towards re…

  5. Hugging Face Daily Papers TIER_1 English(EN) ·

    Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

    Flow-DPPO replaces ratio clipping with divergence proximal constraints in flow matching models, improving training stability and multi-objective optimization through exact KL divergence computation.

  6. arXiv cs.AI TIER_1 English(EN) · Antonio Loquercio ·

    Reinforcement Learning for Flow-Matching Policies with Density Transport

    We present an online reinforcement learning (RL) algorithm for fine-tuning flow-matching policies in continuous-control problems. Our key insight is to view RL-based policy improvement as a transport of action densities towards regions of high reward, which naturally aligns with …

  7. arXiv cs.CV TIER_1 English(EN) · An Zhao, Shengyuan Zhang, Zhongjian Sun, Yixiang Zhou, Zejian Li, Ling Yang, Tianrun Chen, Lingyun Sun ·

    Mean Flow Distillation: Robust and Stable Distillation for Flow Matching Models

    arXiv:2606.11155v1 Announce Type: new Abstract: Flow Matching models have demonstrated strong performance across a wide range of generative tasks. However, their reliance on ODE-based iterative sampling incurs substantial computational overhead in inference, which limits their ap…

  8. arXiv cs.CV TIER_1 English(EN) · Lingyun Sun ·

    Mean Flow Distillation: Robust and Stable Distillation for Flow Matching Models

    Flow Matching models have demonstrated strong performance across a wide range of generative tasks. However, their reliance on ODE-based iterative sampling incurs substantial computational overhead in inference, which limits their applicability in real-time scenes. While distillat…