PulseAugur

2-bit GGUF models achieve 63% SWE-rebench pass rate with calibration

A post on r/LocalLLaMA describes a method for calibrating 2-bit quantized language models, specifically GGUF files under 10 GB, for agentic coding tasks. Calibrated quantizations of models such as Qwopus3.6-27B-Coder achieve pass rates above 60% on the SWE-rebench benchmark. The calibration uses real agentic coding logs and an importance matrix to preserve performance while significantly reducing model size and increasing decode speed.
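To make the role of an importance matrix more concrete, here is a toy sketch in Python. It is not the post's method and not llama.cpp's quantization code. It assumes only the general idea that calibration data yields a per-weight importance value (for example, a mean squared activation), and that the quantizer then minimizes importance-weighted error instead of plain squared error. All names, the four-level grid, and the random data are hypothetical.

```python
import random

# Toy 2-bit grid: four signed integer levels (an assumption for illustration,
# not the layout of any real GGUF quantization type).
LEVELS = (-2, -1, 0, 1)

def quantize(weights, scale):
    """Round each weight to the nearest representable value scale * level."""
    return [scale * min(LEVELS, key=lambda q: abs(w - scale * q)) for w in weights]

def weighted_error(weights, dequantized, importance):
    """Importance-weighted squared reconstruction error."""
    return sum(i * (w - d) ** 2 for w, d, i in zip(weights, dequantized, importance))

def best_scale(weights, importance, candidates):
    """Pick the candidate scale that minimizes the weighted error."""
    return min(
        candidates,
        key=lambda s: weighted_error(weights, quantize(weights, s), importance),
    )

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(64)]
# Hypothetical importance values standing in for calibration statistics.
importance = [random.uniform(0.01, 1.0) ** 3 for _ in weights]
uniform = [1.0] * len(weights)

max_abs = max(abs(w) for w in weights)
candidates = [max_abs * k / 100 for k in range(10, 101)]

plain_scale = best_scale(weights, uniform, candidates)      # ignores importance
aware_scale = best_scale(weights, importance, candidates)   # uses importance

err_plain = weighted_error(weights, quantize(weights, plain_scale), importance)
err_aware = weighted_error(weights, quantize(weights, aware_scale), importance)
print(f"weighted error, plain scale: {err_plain:.4f}")
print(f"weighted error, importance-aware scale: {err_aware:.4f}")
```

Because both scales are chosen from the same candidate list, the importance-aware choice can never score worse on the weighted error. This illustrates why the choice of calibration data matters: it decides which weights the quantizer protects.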

IMPACT Enables more efficient deployment of capable coding agents on resource-constrained hardware.

RANK_REASON Novel research on model quantization and calibration for specific tasks.

Read on r/LocalLLaMA →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. r/LocalLLaMA TIER_1 English(EN) · /u/professormunchies

    Calibrating 2-bit GGUFs (<10 GB) for agentic coding tasks

    TL;DR: Small quantizations (< 10 GB) of Qwopus3.6-27B-Coder (https://huggingface.co/Jackrong/Qwopus3.6-27B-Coder) calibrated on agentic coding logs with a bundled MTP that achieve >60% pass rate on SWE-rebench.…