Need some help from someone who knows llama-cpp vulkan builds (docker in this case)
A user on r/LocalLLaMA found that recent builds of llama-cpp skipped the reasoning phase of the Gemma 4 31b model. Reasoning had worked in earlier builds, and the regression seemed to stem from a recent update related to the Gemma 4 12b unified model. The user traced the cause to a new "thinking" dropdown in the chat interface that is disabled by default; enabling it restored the reasoning phase.
AI IMPACT: Troubleshooting tip for users of llama-cpp and Gemma models.