ProactiveLLM: Learning Active Interaction for Streaming Large Language Models
Researchers have introduced ProactiveLLM, an approach that lets streaming large language models actively decide when to interact with incoming data. The method targets the latency and computational inefficiency of traditional LLMs and of current streaming models. ProactiveLLM learns to judge whether a partial input is semantically sufficient, using mask-based streaming modeling and synchronized privileged self-distillation, so it needs no external alignment signals or annotations. In evaluations on text and speech tasks, it significantly reduces interaction latency while preserving output quality.
AI IMPACT: Reduces latency in streaming LLMs, potentially improving real-time applications and efficiency.
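
To make the idea of "deciding when to interact" concrete, here is a minimal toy sketch of a read-or-respond loop over a stream of input chunks. It is not ProactiveLLM's method: the summary gives no implementation details, so the function names (`first_response_point`, `toy_sufficiency`), the threshold, and the punctuation-based scorer are all hypothetical stand-ins for whatever learned sufficiency signal the model actually uses.

```python
from typing import Callable, Iterable, List, Optional


def first_response_point(
    chunks: Iterable[str],
    sufficiency: Callable[[List[str]], float],
    threshold: float = 0.5,
) -> Optional[int]:
    """Return how many chunks were read before the policy chose to respond.

    Returns None if the score never reaches the threshold.
    """
    prefix: List[str] = []
    for chunk in chunks:
        prefix.append(chunk)  # READ: consume one more chunk of the stream
        if sufficiency(prefix) >= threshold:
            return len(prefix)  # RESPOND: the prefix is judged sufficient
    return None


def toy_sufficiency(prefix: List[str]) -> float:
    """Hypothetical scorer: 1.0 once any chunk ends a sentence, else 0.0.

    A learned model would replace this; the rule is only for illustration.
    """
    return 1.0 if any(tok.endswith((".", "?")) for tok in prefix) else 0.0


chunks = ["Translate", "this", "sentence.", "Then", "stop."]
print(first_response_point(chunks, toy_sufficiency))  # prints 3
```

In this toy setup, responding after 3 of 5 chunks instead of waiting for the full input is the kind of latency saving the summary describes. The quality of the decision depends entirely on the scorer, which here is a placeholder.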