Entity
SmoothQuant
PulseAugur coverage of SmoothQuant: every cluster mentioning SmoothQuant across labs, papers, and developer communities, ranked by signal.
Total · 30 days
2
2 in the last 90 days
Releases · 30 days
0
0 in the last 90 days
Papers · 30 days
1
1 in the last 90 days
Tier distribution · 90 days
Sentiment · 30 days
1 day with sentiment data
Recent · page 1 of 1 · 2 items
-
llmcompressor tool enables LLM compression via FP8, GPTQ, SmoothQuant
A new open-source tool named llmcompressor allows developers to compress and benchmark instruction-tuned large language models. The tool demonstrates how to apply post-training quantization techniques such as FP8, GPTQ,…
-
Optimizing Transformer Inference: Techniques for Faster, Cheaper Large Models
Large transformer models present significant inference challenges due to their substantial memory footprint and computation costs, which scale quadratically with input length. Researchers and practitioners are exploring…