llama.cpp B9406 fixes MTP crash with MoE vision models

By PulseAugur Editorial · [1 source] · 2026-05-29 13:14

The llama.cpp project has released version B9406, which fixes a crash that occurs when MTP (multi-token prediction) is used with MoE (mixture-of-experts) models that have vision support. The issue affected users running models such as Qwen3.6-35B-A3B when processing image chunks. The update is intended to resolve the GGML_ASSERT(i01 >= 0 && i01 < ne01) failure encountered in the get_rows function.
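To make the failing check concrete, here is a minimal Python sketch of a row-gather operation guarded by the same condition as the reported assertion, GGML_ASSERT(i01 >= 0 && i01 < ne01). It is not llama.cpp code. The names get_rows, table, and indices are illustrative, and reading i01 as the requested row index and ne01 as the number of available rows is inferred from the assertion text. The out-of-range index of -1 is a made-up value; the source does not say which index triggered the crash.

```python
def get_rows(table, indices):
    """Gather rows of `table` by index, rejecting out-of-range indices."""
    ne01 = len(table)  # assumed meaning: number of rows in the source table
    out = []
    for i01 in indices:
        # Same condition as the reported assertion: 0 <= i01 < ne01.
        if not (0 <= i01 < ne01):
            raise IndexError(f"row index {i01} out of range [0, {ne01})")
        out.append(list(table[i01]))
    return out


embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # toy 3-row table

print(get_rows(embeddings, [2, 0]))  # both indices are in range

try:
    # -1 is a hypothetical invalid index used only for illustration.
    get_rows(embeddings, [1, -1])
except IndexError as err:
    print("rejected:", err)
```

In this sketch an invalid index raises an exception that the caller can handle, whereas a failed GGML_ASSERT aborts the process, which is why users saw the problem as a crash.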

IMPACT Resolves a specific bug for users running multimodal MoE models locally, improving usability.

RANK_REASON This is a software release for an open-source project that improves functionality for running specific types of models.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

r/LocalLLaMA TIER_1 (CA) · /u/Bulky-Priority6824 · 2026-05-29 13:14

Llama.cpp B9406 MTP mmproj fix

B9406 (https://github.com/ggml-org/llama.cpp/releases/tag/b9406): Been waiting for this one. Building now. Report your results if you test!

"GGML_ASSERT(i01 >= 0 && i01 < ne01) crash in ge…"
