TextPro-SLM reduces speech LLM modality gap by enhancing input processing

By PulseAugur Editorial Team · [2 sources] · 2026-05-07 09:32

Researchers have developed TextPro-SLM, a speech large language model (SLM) designed to minimize the modality gap between spoken and text-based inputs. Whereas prior work mainly tried to reduce this gap from the output side by making speech generation more text-like, TextPro-SLM works from the input side, so that the speech LLM behaves like a prosody-aware text LLM. The model integrates a unified speech encoder with an LLM backbone and achieves state-of-the-art performance on paralinguistic understanding tasks with significantly less training data.

Impact: This research could lead to more accurate and efficient speech LLMs by focusing on input processing rather than output generation.

Ranking rationale: The cluster contains an arXiv preprint detailing a new model and methodology.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Wenqian Cui, Xiao-Hui Li, Daxin Tan, Qiyong Zheng, Irwin King · 2026-05-08 04:00

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM

arXiv:2605.05927v1 · Abstract: Speech large language models (SLMs) are typically built from text large language model (TLM) checkpoints, yet they still suffer from a substantial modality gap. Prior work has mainly attempted to reduce this gap from the output side…
arXiv cs.CL TIER_1 English(EN) · Irwin King · 2026-05-07 09:32

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM

Speech large language models (SLMs) are typically built from text large language model (TLM) checkpoints, yet they still suffer from a substantial modality gap. Prior work has mainly attempted to reduce this gap from the output side by making speech generation more text-like, but…
