PulseAugur

Raon-Speech launches 9B-parameter model for speech understanding and generation

Researchers have introduced Raon-Speech, a 9-billion-parameter speech language model that understands, answers, and generates speech in English and Korean. Trained on over 1.38 million hours of curated speech and text data, the model outperforms similarly sized audio foundation models on speech-centric tasks while retaining strong text-based question-answering ability. An extension, Raon-SpeechChat, adds real-time, full-duplex conversation through further training on dialogue data and shows strengths in turn-taking and interruption sensitivity.

IMPACT Raon-Speech reportedly outperforms similarly sized audio foundation models on speech-centric tasks, which could improve human-computer interaction and real-time conversational AI.

RANK_REASON The cluster contains an arXiv paper detailing a new speech language model.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Beomsoo Kim, Changho Choi, Dohyun Kim, Dongki Lee, Ethan Ewer, Eunchong Kim, Gyeongman Kim, Haechan Kim, Hyeonghwan Kim, Inkyu Park, Jihun Yun, Jihwan Moon, Jiyun Kim, Joonghyun Bae, Junhyuck Kim, Minkyu Kim, Sehun Lee, Seungjun Chung, Sungwoo Cho, Dongm…

    Raon-Speech Technical Report

    arXiv:2605.23912v1 Announce Type: cross Abstract: We present Raon-Speech, a top-performing 9B-parameter speech language model (SpeechLM) for English and Korean speech understanding, answering, and generation, and Raon-SpeechChat, a high-performing full-duplex extension for natura…