Brief · PulseAugur

TOOL · arXiv stat.ML · 20h

Deterministic Inference across Tensor Parallel Sizes That Eliminates Training-Inference Mismatch

Researchers have developed Tree-Based Invariant Kernels (TBIK) to make large language model inference deterministic regardless of tensor parallel (TP) size. Because floating-point addition is not associative, changing the TP size changes the order in which partial results are reduced, so identical inputs can produce different outputs. TBIK guarantees bit-wise reproducibility by aligning reduction orders through a hierarchical binary tree structure, which matters for applications such as LLM-as-a-judge and reinforcement learning.
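The reduction-order effect can be illustrated with a small sketch. This is not the TBIK implementation; it is a toy model in plain Python (float64) that assumes the input length and the TP size are both powers of two, and the function names `naive_tp_sum`, `tree_sum`, and `tp_tree_sum` are hypothetical. It compares a left-to-right reduction, whose result depends on how the data is sharded, with a fixed binary-tree reduction, whose result does not.

```python
# Toy illustration only; not the TBIK kernels. Assumes len(xs) and tp_size
# are powers of two, with tp_size dividing len(xs).

def naive_tp_sum(xs, tp_size):
    """Sum each shard left to right, then add the shard partials left to right."""
    shard_len = len(xs) // tp_size
    total = 0.0
    for rank in range(tp_size):
        partial = 0.0
        for x in xs[rank * shard_len:(rank + 1) * shard_len]:
            partial += x
        total += partial
    return total


def tree_sum(xs):
    """Sum with a fixed binary tree: split at the midpoint and add the halves."""
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return tree_sum(xs[:mid]) + tree_sum(xs[mid:])


def tp_tree_sum(xs, tp_size):
    """Each simulated rank tree-sums its shard; partials are then tree-summed.

    Under the power-of-two assumption, every shard is a subtree of the global
    tree, so the additions happen in the same order for every tp_size.
    """
    shard_len = len(xs) // tp_size
    partials = [
        tree_sum(xs[rank * shard_len:(rank + 1) * shard_len])
        for rank in range(tp_size)
    ]
    return tree_sum(partials)


# Values chosen so that rounding makes the addition order visible in float64.
xs = [1e16, 1.0, -1e16, 1.0] * 2

for tp in (1, 2, 4, 8):
    print(tp, naive_tp_sum(xs, tp), tp_tree_sum(xs, tp))
```

In this sketch the tree-based result is identical for every simulated TP size, while the left-to-right result changes with the sharding. The tree order is not necessarily more accurate; the point is only that it is the same order everywhere.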

IMPACT Ensures consistent LLM outputs for critical applications like RL and evaluation, removing a key barrier to reliable deployment.

large language models
vLLM
Tree-Based Invariant Kernels
tensor parallel
Xinheng Ding