Ollama v0.23.1 adds Gemma 4 MTP for faster coding on Macs

By PulseAugur Editorial · [1 source] · 2026-05-05 21:37

Ollama has released version 0.23.1, which adds Gemma 4 MTP (Multi-token Processing) speculative decoding to the MLX runner on Macs. According to the release notes, this can give more than a 2x speed increase for the Gemma 4 31B model on coding tasks. The update also includes threading fixes for MLX and MLX-C.
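For readers unfamiliar with speculative decoding, the general draft-and-verify idea can be sketched in a few lines of Python. This is a toy illustration only, not Ollama's or MLX's implementation: the release notes do not describe how Gemma 4 MTP produces or verifies its draft tokens, and every name below (`speculative_decode`, `greedy_decode`, `toy_target`, `toy_draft`) is hypothetical. A cheap draft proposes several tokens, the full model checks them, and the longest matching prefix is kept, so the output is identical to ordinary greedy decoding while needing fewer verification rounds.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token sequence to its greedy next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out[len(prompt):]


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Greedy draft-and-verify. Returns (generated tokens, verification rounds)."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < max_new:
        # 1. The cheap draft proposes k tokens in a row.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target verifies the proposal. A real system would score all
        #    k positions in one batched forward pass; this toy calls the
        #    target per position for clarity and counts the round once.
        rounds += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # first mismatch: keep the target's token
                break
            accepted.append(tok)
        else:
            # Every drafted token matched, so the target adds one bonus token.
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out[len(prompt):len(prompt) + max_new], rounds


def toy_target(seq: List[int]) -> int:
    """Hypothetical 'large' model: counts upward modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except after a 4."""
    return 0 if seq[-1] % 5 == 4 else (seq[-1] + 1) % 10


if __name__ == "__main__":
    tokens, rounds = speculative_decode(toy_target, toy_draft, [1], max_new=12)
    print(tokens == greedy_decode(toy_target, [1], 12), rounds)
```

The speedup in a real system depends on how often the draft agrees with the full model; the toy above only demonstrates that accepted tokens never change the final output.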

Impact: Improves performance when running specific models on Mac hardware, potentially speeding up development workflows.

Ranking rationale: This is a software release for a tool that facilitates running models, not a release of a frontier model itself.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

Mastodon (fosstodon.org) · TIER_1 · English (EN) · 2026-05-05 21:37

⚙️ New Ollama Release! ⚙️
Version: v0.23.1
Release Notes:
## Gemma 4 MTP (Multi-token Processing) for the MLX runner
Gemma 4 MTP speculative decoding is now supported on Macs. This can give over a 2x speed increase for the Gemma 4 31B model on coding tasks.
`ollama run gemma4:…`

Links: github.com/…/15845 github.com/…/pulls
