Qwen 3.6 35B MoE: IQ3_M vs IQ4_NL for Aider/vibe coding?
A user on r/LocalLLaMA is asking whether to run the Qwen 3.6 35B MoE model in the IQ3_M or the IQ4_NL quantization format. The trade-off is output quality against VRAM: the larger IQ4_NL file may not fit in the user's 16GB of VRAM and would spill into system RAM. The user mainly does 'vibe coding' with Ollama and Aider, and is weighing the possible loss of logic and syntax precision with IQ3_M against the speed of keeping the model entirely in VRAM.
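The VRAM side of that trade-off can be roughed out with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a measurement. It assumes a nominal 35 billion parameters and the nominal bits-per-weight figures usually quoted for these formats (about 3.66 for IQ3_M and 4.5 for IQ4_NL). Real GGUF files mix tensor types, so actual sizes differ, and the KV cache and runtime buffers need room on top of the weights. The function and constant names are illustrative only.

```python
# Rough estimate of quantized weight size versus available VRAM.
# All figures are assumptions: nominal parameter count and nominal
# bits-per-weight; real GGUF files will not match these exactly.

GIB = 1024 ** 3  # bytes per GiB

# Assumed nominal bits per weight for each quantization format.
NOMINAL_BPW = {
    "IQ3_M": 3.66,
    "IQ4_NL": 4.5,
}


def estimate_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / GIB


def fits_in_vram(params_billion: float, quant: str, vram_gib: float,
                 overhead_gib: float = 1.0) -> bool:
    """Check whether weights plus an assumed overhead fit in VRAM.

    overhead_gib is a placeholder for KV cache and runtime buffers;
    the real value depends on context length and backend.
    """
    weights = estimate_weight_gib(params_billion, NOMINAL_BPW[quant])
    return weights + overhead_gib <= vram_gib


if __name__ == "__main__":
    for quant, bpw in NOMINAL_BPW.items():
        size = estimate_weight_gib(35, bpw)
        verdict = "fits" if fits_in_vram(35, quant, 16) else "spills"
        print(f"{quant}: ~{size:.1f} GiB of weights, {verdict} in 16 GiB")
```

Under these assumptions IQ3_M lands just under 16 GiB with little headroom for context, while IQ4_NL exceeds it, which matches the concern raised in the post. A longer context window would shrink the IQ3_M margin further.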
IMPACT: User-level discussion on optimizing local LLM performance for coding tasks.