A new uncensored and balanced version of the Gemma4-12B-QAT model has been released. It runs approximately 60% faster thanks to a multi-token-prediction (MTP) draft head used for speculative decoding. The release reports zero refusals on a comprehensive benchmark and is multimodal, with vision support. The model is optimized for creative writing and role-playing; Qwen3.6 is noted as the better choice for agentic coding and tool use.
AI IMPACT This release offers a faster, uncensored option for local LLM deployments, potentially improving user experience in creative and role-playing applications.
RANK_REASON Release of a fine-tuned open-source model.
AI-generated summary · Google Gemini · from 1 source.