Deterministic Inference across Tensor Parallel Sizes That Eliminates Training-Inference Mismatch
Researchers have developed Tree-Based Invariant Kernels (TBIK) to make large language model inference deterministic regardless of tensor parallel (TP) size. The problem they address is that identical inputs can produce different outputs when the TP size changes: floating-point addition is not associative, so a different sharding changes the order in which partial results are reduced. TBIK guarantees bit-wise reproducibility by aligning reduction orders through a hierarchical binary tree structure, which matters for applications such as LLM-as-a-judge and reinforcement learning.
AI IMPACT: Ensures consistent LLM outputs across TP sizes for applications like RL and evaluation, removing one source of mismatch between training and inference.
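The effect of reduction order can be illustrated with a small CPU-only sketch. This is not the TBIK kernel itself, only a toy model of the idea described above: the function names (`left_to_right_sum`, `sequential_sharded_sum`, `tree_sum`, `tree_sharded_sum`) are hypothetical, and the sketch assumes the input length and the TP size are both powers of two so that every shard covers an aligned subtree of one fixed binary tree.

```python
import random


def left_to_right_sum(xs):
    """Plain sequential float accumulation (no compensated summation)."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sequential_sharded_sum(xs, tp):
    """Each of `tp` shards sums left to right; partials are then summed left to right.
    The grouping of additions depends on `tp`, so results may differ in the last bits."""
    size = len(xs) // tp
    partials = [left_to_right_sum(xs[r * size:(r + 1) * size]) for r in range(tp)]
    return left_to_right_sum(partials)


def tree_sum(xs):
    """Pairwise (binary-tree) sum; assumes len(xs) is a power of two."""
    xs = list(xs)
    while len(xs) > 1:
        xs = [xs[i] + xs[i + 1] for i in range(0, len(xs), 2)]
    return xs[0]


def tree_sharded_sum(xs, tp):
    """Each shard reduces its contiguous chunk with the same binary tree, then the
    partials are combined with the same tree. For power-of-two sizes this performs
    exactly the additions of tree_sum(xs), whatever `tp` is."""
    size = len(xs) // tp
    partials = [tree_sum(xs[r * size:(r + 1) * size]) for r in range(tp)]
    return tree_sum(partials)


if __name__ == "__main__":
    # Non-associativity in one line: the two groupings give different doubles.
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))

    random.seed(0)
    values = [random.uniform(-1.0, 1.0) * 10 ** random.randint(-3, 3) for _ in range(64)]
    for tp in (1, 2, 4, 8):
        print(tp, repr(sequential_sharded_sum(values, tp)), repr(tree_sharded_sum(values, tp)))
```

In the tree variant the set of additions performed is fixed by the tree, not by the shard count, which is why the output is bit-identical for every TP size. The sequential variant carries no such guarantee. A real GPU kernel would also have to fix the order inside matrix multiplications and collectives, which this sketch does not attempt.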