PulseAugur

ComfyUI Krea 2 NVFP4 Quantization Shows Slower Performance Than fp8_scaled

A user on r/StableDiffusion reports that the NVFP4 quantization of the Krea 2 model runs about 2x slower in ComfyUI than the fp8_scaled version on a 5060 Ti GPU. The user expected NVFP4 to be at least twice as fast, as it is with the klein9b model, and is asking other users to verify the result.

IMPACT Potential performance bottleneck for users employing NVFP4 quantization with Krea 2 in ComfyUI.

RANK_REASON User-reported performance issue with a specific software and model quantization.


AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. r/StableDiffusion TIER_2 English (EN) · /u/KissMyShinyArse

    ComfyUI's nvfp4 quantization of Krea 2 is 2x slower than fp8_scaled

    krea2_turbo_nvfp4.safetensors performs much worse than krea2_turbo_fp8_scaled.safetensors on my 5060 Ti. I'd expect NVFP4 to be at least twice as fast (which is true for klein9b), but somehow the opposite is true. …