Unsloth releases optimized Gemma 4 assistant models

By PulseAugur Editorial · [1 source] · 2026-06-09 16:12

Unsloth has released quantized assistant models for Gemma 4, intended to speed up inference. They are published as q8_0 GGUF files, with larger quantizations also provided, in Unsloth's Hugging Face repositories. The release targets faster, more accessible local use of Gemma 4.

IMPACT Provides optimized versions of Gemma 4 models for local deployment, potentially improving performance for users.

RANK_REASON Release of optimized, quantized models based on an existing architecture.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 English(EN) · /u/ParadigmComplex · 2026-06-09 16:12

Unsloth Gemma 4 QAT MTP assistant models now available

They're both available as q8_0 models named mtp-gemma-4-*.gguf on the root of the directory and in both q8 and larger quants within an MTP folder. …
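The file layout described in the excerpt above can be illustrated with a small Python sketch that sorts a repository file listing into root-level assistant files and files under the MTP folder. Everything in the listing is hypothetical: the post does not give full file names, so the names below are placeholders, and the helper `find_assistant_files` is invented for illustration. It also assumes that any .gguf file under MTP/ counts, since the post does not state the naming used inside that folder.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical file listing; the real repository contents are not given in the post.
EXAMPLE_FILES = [
    "mtp-gemma-4-example-q8_0.gguf",
    "MTP/mtp-gemma-4-example-q8_0.gguf",
    "MTP/mtp-gemma-4-example-bf16.gguf",
    "gemma-4-example-Q4_K_M.gguf",
    "README.md",
]

# Naming pattern quoted in the post for the root-level q8_0 files.
ROOT_PATTERN = "mtp-gemma-4-*.gguf"


def find_assistant_files(paths):
    """Split paths into (root-level matches, files inside the MTP folder)."""
    root_files, mtp_files = [], []
    for raw in paths:
        path = PurePosixPath(raw)
        if len(path.parts) == 1 and fnmatchcase(path.name, ROOT_PATTERN):
            # File sits at the repository root and follows the quoted pattern.
            root_files.append(raw)
        elif path.parts[0] == "MTP" and path.suffix == ".gguf":
            # Assumption: every .gguf under MTP/ is an assistant model file.
            mtp_files.append(raw)
    return root_files, mtp_files


root_files, mtp_files = find_assistant_files(EXAMPLE_FILES)
print("root:", root_files)
print("MTP folder:", mtp_files)
```

The same function could be applied to a real listing obtained from whichever repository browser or client the reader already uses; only the input list would change.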
