ENTITY Int4

PulseAugur coverage of Int4 — every cluster mentioning Int4 across labs, papers, and developer communities, ranked by signal.

Total · 30d: 8 (8 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 6 (6 over 90d)

TIER MIX · 90D

research 6
tool 1
commentary 1

SENTIMENT · 30D

5 days with sentiment data

RECENT · PAGE 1/1 · 8 TOTAL

RESEARCH · CL_111274 · Jun 25 · 00:47

Research: Compressing recursive reasoners for edge AI destroys global reasoning

A new research paper explores the challenges of compressing recursive reasoning models for deployment on edge hardware. The study found that standard compression techniques, such as INT4 pruning and distillation, preser…
RESEARCH · CL_109544 · Jun 24 · 07:54

Quantization of LLMs inflates reasoning token usage, researchers find

A new research paper highlights that while quantization techniques like INT4 and INT3 are effective at reducing the inference costs of large language models, they can unexpectedly inflate reasoning token usage. This phe…
TOOL · CL_100041 · Jun 19 · 06:39

Quantization causes 7-point task accuracy drop that perplexity fails to catch

A company called Nexus Labs discovered that quantizing a fine-tuned 14B agent model to INT4 using GPTQ resulted in a significant 7-point drop in multi-step task completion accuracy, despite perplexity metrics showing on…
RESEARCH · CL_99958 · Jun 18 · 00:00

New UFP4 recipe tackles shrinkage bias in LLM FP4 pretraining

A new research paper introduces UFP4, a uniform 4-bit training recipe designed to address shrinkage bias in large language model pretraining. The study identifies that current non-uniform FP4 formats, like E2M1 used in …
COMMENTARY · CL_68647 · Jun 3 · 04:43

LLM serving latency stems from system queues, not compute

This article discusses how to optimize Large Language Model (LLM) serving performance, emphasizing that latency issues are typically caused by system bottlenecks rather than model compute. It highlights that queueing, n…
RESEARCH · CL_48868 · May 21 · 22:23

New methods enhance LLM quantization for efficiency and accuracy

Researchers have developed several new methods to improve the efficiency and accuracy of quantizing large language models (LLMs). These techniques aim to reduce the memory footprint and computational cost of LLMs, makin…
RESEARCH · CL_15836 · May 5 · 04:00

The Measure of Deception: An Analysis of Data Forging in Machine Unlearning

Two new research papers explore vulnerabilities and detection methods in machine unlearning, a process designed to remove specific data from trained models for privacy compliance. One paper, "DurableUn," reveals that lo…
RESEARCH · CL_03804 · Apr 25 · 16:08

AI safety research proposes formal framework for computational substrates

This series of posts explores the concept of 'substrates' in AI, which refers to the computational context layers necessary for implementing AI systems. The authors argue that current AI safety research lacks a clear fr…