PulseAugur

Ollama v0.31.1 improves Gemma 4 MoE model loading

Ollama has released version 0.31.1, which tightens the MLX loading code for Gemma 4 Mixture of Experts (MoE) models. The expert tensor names .experts.gate_proj, .up_proj, and .down_proj can now be used for both quantized (nvfp4, mxfp8) and non-quantized (bf16) models; previously only non-quantized models used that naming scheme.

IMPACT Enhances the usability and flexibility of running advanced AI models like Gemma 4 MoE on local hardware.

RANK_REASON This is a software release for a tool that facilitates running AI models locally, not a core AI model release or research.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Ollama — Releases · TIER_1 · Nederlands (NL) · pdevine

    v0.31.1: mlx: tighten up gemma4 moe loading code (#16964)

    This change allows the .experts.gate_proj, .up_proj, and .down_proj tensor names to each be used for both quantized (i.e. nvfp4 and mxfp8) and non-quantized (bf16) models. Previously, only non-quantized models used that tensor naming scheme.
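
The effect of a shared naming scheme can be sketched in a few lines of Python. This is an illustration only, not Ollama's loader: the `layers.0` prefix, the checkpoint-as-dictionary shape, and the helper names `expert_tensor_names` and `find_expert_tensors` are all assumptions made for the example. Only the three projection names and the dtype labels come from the release note.

```python
from __future__ import annotations

# Illustrative sketch only; NOT Ollama's implementation.
# The three projection names come from the release note.
EXPERT_PROJECTIONS = ("gate_proj", "up_proj", "down_proj")


def expert_tensor_names(layer_prefix: str) -> dict[str, str]:
    """Build the shared expert tensor name for each projection.

    `layer_prefix` (e.g. "layers.0") is a hypothetical prefix.
    """
    return {proj: f"{layer_prefix}.experts.{proj}" for proj in EXPERT_PROJECTIONS}


def find_expert_tensors(checkpoint: dict[str, str], layer_prefix: str) -> dict[str, str]:
    """Map each projection to the dtype label stored under its shared name.

    `checkpoint` is a hypothetical {tensor name: dtype label} mapping.
    The lookup never branches on whether the model is quantized.
    """
    found = {}
    for proj, name in expert_tensor_names(layer_prefix).items():
        if name not in checkpoint:
            raise KeyError(f"missing expert tensor: {name}")
        found[proj] = checkpoint[name]
    return found


# Two toy checkpoints that differ only in dtype, not in tensor names.
bf16_ckpt = {f"layers.0.experts.{p}": "bf16" for p in EXPERT_PROJECTIONS}
nvfp4_ckpt = {f"layers.0.experts.{p}": "nvfp4" for p in EXPERT_PROJECTIONS}

print(find_expert_tensors(bf16_ckpt, "layers.0"))
print(find_expert_tensors(nvfp4_ckpt, "layers.0"))
```

Because both toy checkpoints use the same names, a single lookup path serves the quantized and non-quantized cases, which is the simplification the release note describes.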