FBK researchers developed SpeechLLMs for the IWSLT 2026 Instruction Following shared task, covering both short-form and long-form speech instruction following. On the short-form tasks, their model achieved a SIFS score of 2.0708. For the long-form tasks, they compared several segmentation methods and introduced the HIFS score to evaluate performance; fixed 30-second segmentation gave the best results, with a score of 2.0663. Their analysis indicated that hallucinations in long-form generation consisted mainly of repetitive insertions, while short-form capabilities remained largely intact.
AI IMPACT This research contributes to speech instruction following models and could improve how AI systems understand and execute spoken commands.
AI-generated summary · Google Gemini · from 2 sources.