Ollama has released version 0.6.8, introducing performance enhancements for the Qwen 3 MoE model on both NVIDIA and AMD hardware. This update also addresses several issues, including problems with GGML assertions, image input leaks, context cancellation, and out-of-memory handling. Additionally, the release includes improvements for file transfer tools and streaming progress indicators for platforms like Discord and Slack.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Improves the performance and stability of local AI model execution, benefiting developers and users running models like Qwen 3.
RANK_REASON This is a software update for a tool that facilitates running AI models locally, not a release of a new frontier model or significant research.