v0.31.1: mlx: tighten up gemma4 moe loading code (#16964)

Ollama v0.31.1 improves Gemma 4 MoE model loading

By the PulseAugur editorial team · [1 source] · 2026-06-30 04:15

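Ollama has released version 0.31.1, which includes improvements to loading Gemma 4 mixture-of-experts (MoE) models. The update lets the same expert tensor names be used for both quantized and non-quantized versions of these models; previously only non-quantized models used that naming scheme.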

Impact: Improves usability and flexibility when running advanced AI models such as Gemma 4 MoE on local hardware.

Ranking rationale: This is a software release for a tool that runs AI models locally, not a core AI model release or research.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

Ollama Releases · pdevine · 2026-06-30 04:15

v0.31.1: mlx: tighten up gemma4 moe loading code (#16964)

This change allows the .experts.gate_proj / .up_proj / .down_proj tensor names to each be used for both quantized (i.e. nvfp4 and mxfp8) and non-quantized (bf16) models. Previously, only non-quantized models used that tensor naming scheme.
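
To illustrate what a shared naming scheme means in practice, here is a minimal Python sketch. It is not Ollama's loader code. The helper names, the `model.layers.0` prefix, and the dictionary-based "checkpoints" are all hypothetical, made up for this example. Only the three suffixes `.experts.gate_proj`, `.experts.up_proj`, and `.experts.down_proj` and the format names bf16 and nvfp4 come from the release note.

```python
# Hypothetical sketch: these helpers do not exist in Ollama; they only
# illustrate looking up expert tensors by one shared naming scheme.

# The three projection suffixes named in the release note.
EXPERT_PROJECTIONS = ("gate_proj", "up_proj", "down_proj")


def expert_tensor_names(layer_prefix: str) -> list[str]:
    """Build the shared expert tensor names for one MoE layer."""
    return [f"{layer_prefix}.experts.{proj}" for proj in EXPERT_PROJECTIONS]


def missing_expert_tensors(checkpoint: dict[str, str], layer_prefix: str) -> list[str]:
    """Return the expected expert tensor names that the checkpoint lacks.

    `checkpoint` maps tensor name -> storage format (an assumed, simplified
    stand-in for a real weight file's index).
    """
    return [name for name in expert_tensor_names(layer_prefix) if name not in checkpoint]


# Two toy checkpoints that differ only in storage format, not in names.
PREFIX = "model.layers.0"  # assumed prefix, for illustration only
bf16_checkpoint = {name: "bf16" for name in expert_tensor_names(PREFIX)}
nvfp4_checkpoint = {name: "nvfp4" for name in expert_tensor_names(PREFIX)}

# The same lookup works for both, because the names are identical.
for label, ckpt in (("bf16", bf16_checkpoint), ("nvfp4", nvfp4_checkpoint)):
    print(label, "missing:", missing_expert_tensors(ckpt, PREFIX))
```

With one naming scheme, a loader in this style needs a single lookup path; the storage format becomes a property of the tensor rather than something encoded in its name.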