llama.cpp Gemma 4 MTP support merged!
The llama.cpp project has merged support for Gemma 4 MTP (multi-token prediction), a feature intended to speed up inference for locally run Gemma models. The change landed through a pull request from user am17an and is now available in the ggml-org/llama.cpp repository.
AI IMPACT: Faster local LLM inference, making personal AI deployments more efficient.