DiffusionGemma 26B runs at 100 TPS on four AMD 7900 XTX GPUs

By PulseAugur Editorial · 1 source · 2026-06-11 15:18

A Reddit user reported running DiffusionGemma 26B on four AMD 7900 XTX GPUs. Generation peaked at 100 tokens per second, while overall throughput dropped to 45-60 tokens per second once prompt processing was included. The user posted the lengthy Docker command used to configure vLLM for this hardware and noted that preparing the image consumed a significant number of DeepSeek-V4-Pro tokens.
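The gap between the two figures can be read as a time split. The sketch below is illustrative arithmetic only, not a measurement from the post. It assumes (hypothetically) that the overall figure counts only generated tokens divided by total wall-clock time, and that generation runs at a steady 100 tokens per second, whereas the post reports 100 as a peak. Under those assumptions, overall TPS divided by generation TPS equals the share of time spent generating. The function name `generation_time_share` is made up for this example.

```python
def generation_time_share(overall_tps: float, generation_tps: float) -> float:
    """Return the fraction of wall-clock time spent generating tokens.

    Assumption (not from the source): overall_tps = generated_tokens / total_time
    and generation_tps = generated_tokens / generation_time, so their ratio is
    generation_time / total_time.
    """
    if generation_tps <= 0:
        raise ValueError("generation_tps must be positive")
    if not 0 <= overall_tps <= generation_tps:
        raise ValueError("overall_tps must lie between 0 and generation_tps")
    return overall_tps / generation_tps


GENERATION_TPS = 100.0  # reported peak, treated here as a steady rate

for overall_tps in (45.0, 60.0):  # reported overall range
    share = generation_time_share(overall_tps, GENERATION_TPS)
    print(
        f"{overall_tps:.0f} TPS overall: {share:.0%} of time generating, "
        f"{1 - share:.0%} on prompt processing and other overhead"
    )
```

Read this way, roughly 40-55% of each request would go to prompt processing and other overhead. The real split depends on prompt length and on how the poster measured throughput.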

IMPACT Demonstrates performance of DiffusionGemma 26B on consumer-grade GPUs, offering insights for local LLM deployment.

RANK_REASON User-generated report on running a specific model on consumer hardware.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 Français(FR) · /u/djdeniro · 2026-06-11 15:18

DifussionGemma 4 on 4x7900xtx

https://www.reddit.com/r/LocalLLaMA/comments/1u31zmk/difussiongemma_4_on_4x7900xtx/
