New Drifting Preference Optimization fine-tunes one-step image generators

By PulseAugur Editorial · [3 sources] · 2026-06-01 17:31

Researchers have developed Drifting Preference Optimization (DrPO), a method for preference fine-tuning of one-step text-to-image generative models. Deterministic one-step generators are attractive for deployment because they produce an image in a single forward pass, but standard alignment methods are difficult to apply to them. DrPO synthesizes an update direction from high- and low-scoring image samples, so it can train against a range of reward functions without requiring the reward to be differentiable.
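The summary does not spell out DrPO's actual update rule, so the following is only a toy sketch of the general idea it describes: score samples with a black-box reward, form a direction from the high-scoring group relative to the low-scoring group, and regress the generator toward outputs shifted along that direction. Everything here is an assumption made for illustration: the linear "generator", the `black_box_reward` function, the half/half split, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def black_box_reward(x):
    """Hypothetical reward. Rounding makes it piecewise constant,
    so backpropagating through it would give no useful gradient."""
    return -np.abs(np.round(x, 1) - 1.0).sum(axis=-1)


def train_toy(steps=300, batch=64, k=8, dim=2, sigma=0.5, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Toy deterministic one-step "generator": x = z @ W + b.
    # This stands in for a real image model (assumption).
    W, b = np.eye(dim), np.zeros(dim)

    z_eval = rng.normal(size=(512, dim))
    before = black_box_reward(z_eval @ W + b).mean()

    for _ in range(steps):
        z = rng.normal(size=(batch, dim))
        x = z @ W + b  # one forward pass per sample

        # Candidate samples around each output, scored by the black-box reward.
        cands = x[:, None, :] + sigma * rng.normal(size=(batch, k, dim))
        r = black_box_reward(cands)  # shape (batch, k)

        # Split candidates into low- and high-scoring halves per sample.
        order = np.argsort(r, axis=1)
        lo_idx, hi_idx = order[:, : k // 2], order[:, k // 2:]
        lo_mean = np.take_along_axis(cands, lo_idx[:, :, None], axis=1).mean(axis=1)
        hi_mean = np.take_along_axis(cands, hi_idx[:, :, None], axis=1).mean(axis=1)

        # Update direction: from the low-scoring group toward the high-scoring group.
        drift = hi_mean - lo_mean

        # Regress toward the fixed target x + drift. The gradient of
        # 0.5 * ||x - target||^2 with respect to x is (x - target) = -drift.
        err = -drift
        W -= lr * z.T @ err / batch
        b -= lr * err.mean(axis=0)

    after = black_box_reward(z_eval @ W + b).mean()
    return before, after
```

Calling `before, after = train_toy()` returns the mean reward on a fixed evaluation batch before and after training. The reward is only ever evaluated, never differentiated, which is the property the summary highlights.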

IMPACT Enables faster and more flexible fine-tuning of one-step image generation models.

RANK_REASON The cluster contains a research paper detailing a new method for generative models.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

arXiv cs.LG TIER_1 English(EN) · Zhou Jiang, Yandong Wen, Zhen Liu · 2026-06-02 04:00

Drifting Preference Optimization for One-Step Generative Models

arXiv:2606.02521v1 · Abstract: One-step text-to-image generators are attractive for deployment because they generate an image with a single forward pass, but preference finetuning them remains difficult: standard alignment methods often rely on policy likelihoods…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-01 17:31

Drifting Preference Optimization for One-Step Generative Models

One-step text-to-image generators are attractive for deployment because they generate an image with a single forward pass, but preference finetuning them remains difficult: standard alignment methods often rely on policy likelihoods, denoising trajectories, differentiable reward …

arXiv cs.LG TIER_1 English(EN) · Zhen Liu · 2026-06-01 17:31

Drifting Preference Optimization for One-Step Generative Models

One-step text-to-image generators are attractive for deployment because they generate an image with a single forward pass, but preference finetuning them remains difficult: standard alignment methods often rely on policy likelihoods, denoising trajectories, differentiable reward …
