English(EN)A Pocket Offline Model for Simultaneous Speech Translation as CUNI Submission to IWSLT 2026
新方法利用仅解码器LLM增强同步语音翻译
作者PulseAugur 编辑部·[10 个来源]·
研究人员正在开发新的同步语音翻译方法,重点关注仅解码器的大型语言模型。一种名为AlignAtt4LLM的方法,通过调整这些模型的注意力机制来提高德语和意大利语等语言的翻译质量,即使在低延迟场景下也是如此。另一种名为DOA的方法,在SpeechLLMs内部使用自注意力机制,在无需重新训练的情况下获得长文本翻译的对齐信号。此外,一个名为Canary的系统,拥有10亿参数,提供了多种语言的离线同步翻译能力。
AI
arXiv:2606.04730v1 Announce Type: new Abstract: With the advent of Large Language Models, single-task and token-based multi-task models have evolved into instruction-based systems that infer task and target language implicitly from natural language prompts. This trend is reflecte…
With the advent of Large Language Models, single-task and token-based multi-task models have evolved into instruction-based systems that infer task and target language implicitly from natural language prompts. This trend is reflected in IWSLT's Instruction Following Track, which …
arXiv:2606.03967v1 Announce Type: cross Abstract: We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated sou…
arXiv:2606.03948v1 Announce Type: new Abstract: We implement simultaneous translation capability with the offline direct speech-to-text translation model Canary, using the state-of-the-art policy AlignAtt, and submit it to IWSLT 2026 Simultaneous Speech Translation Shared task fo…
A bilingual multi-attribute benchmark for instruction-guided speech editing is introduced to systematically evaluate speech modification capabilities across atomic and compositional tasks.
We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated source transcript, and Gemma-4 E4B-it translates that…
We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated source transcript, and Gemma-4 E4B-it translates that…
We implement simultaneous translation capability with the offline direct speech-to-text translation model Canary, using the state-of-the-art policy AlignAtt, and submit it to IWSLT 2026 Simultaneous Speech Translation Shared task for Czech to English and English to German and Ita…
arXiv:2605.31432v1 Announce Type: cross Abstract: Simultaneous speech-to-text translation (SimulST) generates translations while speech is still unfolding, requiring a streaming policy that decides when to read and when to write. State-of-the-art approaches rely on attention-base…
Simultaneous speech-to-text translation (SimulST) generates translations while speech is still unfolding, requiring a streaming policy that decides when to read and when to write. State-of-the-art approaches rely on attention-based encoder-decoder models where cross-attention pro…