PulseAugur

GPTQ

PulseAugur coverage of GPTQ — every cluster mentioning GPTQ across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 8 (90 days: 8)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 6 (90 days: 6)
Sentiment · 30 days: 4 days with sentiment data

Recent · Page 1 of 1 · 8 items
  1. RESEARCH · CL_35775 ·

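    llmcompressor tool compresses LLMs with FP8, GPTQ, and SmoothQuant

    A new open-source tool called llmcompressor lets developers compress and benchmark instruction-tuned large language models. The tool demonstrates how to apply post-training quantization techniques such as FP8, GPTQ, and SmoothQuant. The process aims to reduce model size and improve inference speed while evaluating the performance trade-offs.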

  2. TOOL · CL_30718 ·

    New paper details improved quantization for LLM matrix multiplication

    Researchers have published a paper detailing advancements in quantized matrix multiplication, specifically for large language models (LLMs). This second part of their work focuses on scenarios where the covariance matri…

  3. TOOL · CL_27223 ·

    ExLlamaV3, Unsloth Qwen, and Phi3 agent see major local AI updates

    This week's local AI news highlights significant updates to the ExLlamaV3 inference library, enhancing efficiency for running quantized Llama models on consumer GPUs. Additionally, new GGUF-quantized versions of Qwen 3.…

  4. RESEARCH · CL_15961 ·

    New methods accelerate LLMs via efficient sparsification, quantization, and compression

    Researchers have developed several new methods for compressing and optimizing large language models (LLMs) to improve efficiency and reduce computational costs. SparseForge focuses on efficient semi-structured sparsific…

  5. RESEARCH · CL_11807 ·

    New methods tackle LLM quantization for improved efficiency and accuracy

    Researchers have developed several new methods to improve the efficiency of large language models (LLMs) through quantization. OSAQ focuses on suppressing weight outliers using a low-rank Hessian property for accurate l…

  6. RESEARCH · CL_14463 ·

    New research explores LLM security, efficiency, and training optimization

    Researchers are developing novel methods to enhance the efficiency and security of Large Language Models (LLMs). One approach, "Widening the Gap," exploits outlier injection to compromise LLM quantization, demonstrating…

  7. RESEARCH · CL_01274 ·

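    Hugging Face introduces advanced quantization techniques for efficient LLMs

    Researchers are developing advanced quantization techniques to make large language models (LLMs) more efficient. New methods such as AutoRound, LATMiX, and GSQ aim to reduce model size and compute requirements, enabling deployment on less powerful hardware. These methods focus on optimizing how model weights and activations are represented at lower bit widths, and some have reached accuracy comparable to higher-precision models. Innovations include novel calibration strategies for post-training quantization and learnable affine transformations for improved robustness.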

  8. RESEARCH · CL_01035 ·

    Optimizing Transformer Inference: Techniques for Faster, Cheaper Large Models

    Large transformer models present significant inference challenges due to their substantial memory footprint and computation costs, which scale quadratically with input length. Researchers and practitioners are exploring…