ENTITY NVFP4

NVFP4

PulseAugur coverage of NVFP4 — every cluster mentioning NVFP4 across labs, papers, and developer communities, ranked by signal.

Total: 5 over 90d

Releases: 0 over 90d

Papers: 4 over 90d

Sentiment (30d): 2 day(s) with sentiment data

RECENT · PAGE 1/1 · 5 TOTAL

TOOL · CL_29454 · May 12 · 15:13

SOAR framework boosts LLM accuracy with novel NVFP4 quantization

Researchers have introduced SOAR, a new post-training quantization framework designed to enhance the accuracy of NVFP4 quantization for large language models. SOAR employs Closed-form Joint Scale Optimization (CJSO) to …

TOOL · CL_25619 · May 8 · 12:32

New number formats boost AI direction preservation

Researchers have developed a new geometric framework to analyze how well low-precision number formats in machine learning preserve vector direction. The study analytically quantifies the suboptimality of standard format…

TOOL · CL_22142 · May 8 · 04:00

New 4/6 quantization method boosts LLM accuracy with adaptive scaling

Researchers have developed a new quantization method called Four Over Six (4/6) to improve the accuracy of low-precision numerical formats like NVFP4 for large language models. This technique adaptively scales blocks to…

RESEARCH · CL_03567 · Apr 25 · 22:41

Qwen3.6-35B model quantizations show FP8 quality worse than INT8, NVFP4 is a lie

A user on Reddit's LocalLLaMA community shared findings on the Qwen3.6-35B model, focusing on Kullback-Leibler divergence (KLD) metrics for different quantization formats such as INT8, FP8, and NVFP4. The analysis, conduct…

RESEARCH · CL_03577 · Apr 25 · 15:42

llama.cpp and ik_llama.cpp add FP4 inference support for VRAM savings

The llama.cpp and ik_llama.cpp projects have both integrated support for FP4 (4-bit floating-point) inference, a significant advancement for model quantization. llama.cpp now includes NVFP4, an NVIDIA-specific format, w…
