PulseAugur

Stable Diffusion VAEs from Wan2.1 and Qwen-Image found to be interchangeable

A Reddit user has found that the variational autoencoders (VAEs) from Wan2.1 and Qwen-Image are compatible: each can decode the other's latent representations. The two VAEs share the same base architecture and latent space dimensionality, but their different training objectives lead to visibly different outputs. The Wan-VAE, trained on video, tends to produce smoother images, whereas the Qwen-Image VAE, fine-tuned on static images, prioritizes preserving spatial detail and sharp text rendering. The user has also released a ComfyUI node pack for further experimentation with these VAEs.
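Since both VAEs can decode the same latent, their two outputs can in principle be mixed, which is what the title of the linked post describes. The sketch below shows one simple way to do that, a per-value linear blend. It is an assumption for illustration, not the method used by the node pack: the function name `blend_decodes`, the `alpha` parameter, and the sample pixel values are all hypothetical, and images are represented as flat lists of floats in [0, 1] so the example runs without any dependencies.

```python
from typing import List, Sequence


def blend_decodes(
    img_a: Sequence[float], img_b: Sequence[float], alpha: float = 0.5
) -> List[float]:
    """Linearly blend two decoded images given as flat pixel sequences.

    alpha = 0.0 returns img_a unchanged, alpha = 1.0 returns img_b.
    """
    if len(img_a) != len(img_b):
        raise ValueError("decoded images must have the same number of values")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Weighted average of corresponding pixel values.
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(img_a, img_b)]


# Hypothetical stand-ins for the same latent decoded by each VAE.
wan_decode = [0.25, 0.5, 0.75]   # e.g. the smoother Wan-VAE output
qwen_decode = [0.75, 0.5, 0.25]  # e.g. the sharper Qwen-Image VAE output

blended = blend_decodes(wan_decode, qwen_decode, alpha=0.5)
```

In a real workflow the same weighted average would be applied to image tensors rather than Python lists; a lower `alpha` keeps more of the first decode and a higher one keeps more of the second.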

IMPACT Enables new creative workflows by allowing interchangeable use of VAEs from different image generation models.

RANK_REASON User-discovered compatibility between components of different models.


AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. r/StableDiffusion TIER_2 English(EN) · /u/lapula ·

    Blend images decoded by different VAEs for Krea2

    https://www.reddit.com/r/StableDiffusion/comments/1ugycbr/blend_images_decoded_by_different_vaes_for_krea2/