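Users are discussing the performance and practicality of the Gemma 4 QAT (quantization-aware training) models, particularly in comparison with standard quantization. While some users report improved speed and quality on general tasks, others consider the QAT models a step backward, especially for specific use cases such as tool calling or coding. Benchmarks are under way to quantify the differences, and the results so far are mixed: they suggest that QAT models do not always outperform higher-bit standard quants and sometimes show unexpected behavior.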
AI
<!-- SC_OFF --><div class="md"><p>I have enough RAM+VRAM to use gemma4 26b a4b up to q6_k quantizations w/ decent performance. Does anyone have any comparisons of the Q4_0 QATs (at 4 bits/wt) vs non-QATs at &gt;4 bits/wt (e.g. q6_K)?</p> <p>KLD vs the originals wouldn't be approp…
<!-- SC_OFF --><div class="md"><p>I'm trying to find out if anyone has done any benchmarking comparing the Gemma 4 4-bit QAT models (via Unsloth) against standard 8-bit non-QAT quants.</p> <p>I know QAT is supposed to retain a ton of accuracy compared to the baseline BF16, but I'…
<!-- SC_OFF --><div class="md"><p>Hopefully this isn't too low effort of a post. I just finished the benchmarks and I figured I'd post them online because they certainly were insightful for me. I did not use any AI other than asking Gemini 3.1 Pro if it was statistically signific…
<!-- SC_OFF --><div class="md"><p>I spent the last few days trying to get consistent tool calling out of the new Gemma 4 12b QAT model and had to give up. When the model actually works, it works great, but for my specific use case and workflows it is just not for me. It is a majo…
<!-- SC_OFF --><div class="md"><p>Hey everyone!</p> <p>Not a native speaker, so please correct my English where I make mistakes (I can only learn from it!).</p> <p>While it's only been out for a little while, I wanted to post about it because it's been such a joy.</p> <p>So, to say …
<table> <tr><td> <a href="https://www.reddit.com/r/LocalLLaMA/comments/1tzib7d/qat_variant_of_gemma4_26b_a4b_is_not_working_well/"> <img alt="QAT variant of Gemma4 26B A4B is not working well for me" src="https://preview.redd.it/albcm4kp0w5h1.png?width=140&height=140&crop…
<!-- SC_OFF --><div class="md"><p>I just came across the following post, where a user found some confusing divergence results between Q4 quants of the original and QAT models with a Q8/unquantized reference of the original model.</p> <p><a href="https://www.reddit.com/r/LocalLLaM…
<!-- SC_OFF --><div class="md"><p>No numbers. Not sure if anybody cares…</p> <p>I’ve run the UD version of Q4_k_m for a month. I talk to this model nicely, because it’s a functional nervous wreck. And initially I thought that might be an alignment thing, so I also have the hereti…