Ollama has released version 0.6.8, which improves performance for the Qwen 3 MoE model on both NVIDIA and AMD hardware. The update also fixes several issues, including problems with GGML assertions, image input leaks, context cancellation, and out-of-memory handling. The release additionally includes improvements for file transfer tools and streaming progress indicators for platforms like Discord and Slack.
AI IMPACT Improves the performance and stability of local AI model execution, benefiting developers and users running models like Qwen 3.
RANK_REASON This is a software update for a tool that facilitates running AI models locally, not a release of a new frontier model or significant research.
Source: Mastodon (fosstodon.org)
AI-generated summary · Google Gemini · from 1 source.