TORQ framework enhances LLM accuracy with MXFP4 quantization

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

Researchers have developed TORQ, a framework for quantizing Large Language Models (LLMs) to the MXFP4 format. The method targets the accuracy degradation seen under MXFP4 by analyzing and correcting imbalances in activation quantization. TORQ applies a two-level orthogonal rotation strategy to reshape the activation space, which the authors report improves LLM accuracy under 4-bit floating-point quantization.
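
To make the two ingredients named above concrete, here is a minimal sketch in plain Python. It assumes the MXFP4 layout from the OCP Microscaling specification (blocks of 32 values sharing one power-of-two scale, each value stored as FP4 E2M1) and uses a normalized Hadamard transform as a generic stand-in for an orthogonal rotation. It is not TORQ's two-level rotation, which the summary does not describe in detail; the function names are hypothetical and tie-breaking in rounding is simplified.

```python
import math

# Magnitudes representable in FP4 (E2M1), per the OCP Microscaling format.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 32       # MX block size: 32 elements share one scale
E2M1_EMAX = 2    # exponent of the largest FP4 value (6.0 = 1.5 * 2**2)


def mxfp4_quantize_block(block):
    """Quantize then dequantize one block with a shared power-of-two scale."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    scale = 2.0 ** (math.floor(math.log2(amax)) - E2M1_EMAX)
    out = []
    for v in block:
        mag = min(abs(v) / scale, FP4_GRID[-1])         # saturate at 6.0
        q = min(FP4_GRID, key=lambda g: abs(g - mag))   # nearest grid point (ties simplified)
        out.append(math.copysign(q * scale, v))
    return out


def hadamard_rotate(vec):
    """Normalized fast Walsh-Hadamard transform; length must be a power of two.

    Assumption: used here only as a simple orthogonal rotation, not as TORQ's method.
    The normalized transform is its own inverse.
    """
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(n)
    return [v / norm for v in out]


def squared_error(block, rotate):
    """Round-trip squared error, optionally quantizing in the rotated space."""
    src = hadamard_rotate(block) if rotate else list(block)
    deq = mxfp4_quantize_block(src)
    rec = hadamard_rotate(deq) if rotate else deq
    return sum((a - b) ** 2 for a, b in zip(block, rec))


if __name__ == "__main__":
    # Hypothetical activation block: small values plus a single outlier.
    x = [0.1] * (BLOCK - 1) + [8.0]
    print("direct :", squared_error(x, rotate=False))
    print("rotated:", squared_error(x, rotate=True))
```

Whether a rotation helps or hurts depends on how it redistributes values relative to the shared block scale, so the two printed errors are best read as a way to experiment with different inputs rather than as evidence for any particular method.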

IMPACT Improves LLM efficiency and accuracy by enabling better low-bit quantization, potentially reducing inference costs.

RANK_REASON The cluster contains a research paper detailing a new method for LLM quantization.

paper
infra

COVERAGE [1]

Hugging Face Daily Papers TIER_1 · 2026-05-19 09:05

TORQ: Two-Level Orthogonal Rotation for MXFP4 Quantization

As Large Language Models (LLMs) advance toward practical deployment, the Microscaling FP4 (MXFP4) format has emerged as a cornerstone for next-generation low-bit inference, owing to its ability to balance high dynamic range with hardware efficiency. However, directly applying MXF…
