I have trained diffusion and flow matching models from scratch. Same architecture, same dataset, huge difference.
A user trained two generative image models from scratch, one with diffusion and one with flow matching, using identical architectures and datasets to compare them. The flow matching model learned faster early on, producing recognizable images much earlier in training. It also showed better global structure, prompt adherence, and zero-shot generation than the diffusion model, even though both used the same text encoder.
IMPACT: Flow matching models show potential for faster training and improved generalization in generative image tasks.
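
For readers unfamiliar with how the two objectives differ, here is a minimal NumPy sketch of how each method might build one training example from the same clean data. The summary does not say which formulations the user trained, so everything here is an assumption: the diffusion side uses DDPM-style noise prediction with a linear beta schedule, the flow matching side uses a straight-line path between data and noise with a velocity target, and the function names are hypothetical. The network, text encoder, and image shapes are omitted; data is treated as flat vectors.

```python
import numpy as np


def diffusion_training_pair(x0, rng, num_steps=1000):
    """Assumed DDPM-style setup: the model would be trained to predict eps."""
    # Linear beta schedule (an assumption, not taken from the source).
    betas = np.linspace(1e-4, 0.02, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    # One random discrete timestep per example in the batch.
    t = rng.integers(0, num_steps, size=x0.shape[0])
    eps = rng.standard_normal(x0.shape)
    signal = np.sqrt(alpha_bar[t])[:, None]
    noise_scale = np.sqrt(1.0 - alpha_bar[t])[:, None]
    x_t = signal * x0 + noise_scale * eps
    return x_t, t, eps  # regression target: the added noise


def flow_matching_training_pair(x0, rng):
    """Assumed straight-line path: the model would be trained to predict velocity."""
    # One random continuous time in [0, 1) per example in the batch.
    t = rng.uniform(0.0, 1.0, size=x0.shape[0])
    noise = rng.standard_normal(x0.shape)
    tt = t[:, None]
    x_t = (1.0 - tt) * x0 + tt * noise  # point on the line from data to noise
    velocity = noise - x0  # constant along that line
    return x_t, t, velocity  # regression target: the velocity


def mse(pred, target):
    """Mean squared error, the loss used for both targets in this sketch."""
    return float(np.mean((pred - target) ** 2))
```

In this sketch the two pipelines differ only in how the noisy input and the regression target are built, which is why an identical architecture and dataset can be reused for both. A training step would call one of the functions on a batch, run the model on `x_t` and `t`, and apply `mse` to the model output and the returned target.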