DiffusionGemma 26B A4B tuned for RTX 5090, boosting speed

By PulseAugur Editorial · [1 source] · 2026-06-11 15:00

A Reddit user shared tuning results for the DiffusionGemma 26B A4B model on an RTX 5090 GPU. The post details the parameters that worked best and compares speeds across quantization levels and context lengths. Tuning significantly improved throughput, with the Q4_K_M variant showing up to a 44% speedup at longer contexts.

IMPACT Demonstrates how parameter tuning can significantly enhance the performance of open-source models on consumer hardware.

RANK_REASON User-generated tuning results and performance benchmarks for an open-source model.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 English(EN) · /u/giveen · 2026-06-11 15:00

DiffusionGemma 26B A4B results on my 5090

# DiffusionGemma 26B A4B: Tuning Results

https://huggingface.co/unsloth/diffusiongemma-26B-A4B-it-GGUF

## System

- **GPU**: RTX 5090 (32 GB VRAM), …
