PulseAugur

LibriConvo corpus advances ASR and speaker diarization

Researchers have developed LibriConvo, a synthetic conversational speech corpus designed to improve automatic speech recognition (ASR) and speaker diarization systems. The corpus was built by adapting the Speaker-Aware Simulated Conversation (SASC) framework: the English CallHome data was processed to supply conversational timing, and LibriTTS utterances were grouped by book to preserve semantic continuity. LibriConvo contains over 240 hours of audio from 830 speakers. In baseline results, Sortformer (for diarization) and a fine-tuned Fast Conformer-CTC XLarge (for ASR) outperform the existing systems they were compared against on this benchmark.

IMPACT Provides a new benchmark for evaluating and improving multi-speaker speech processing systems.

RANK_REASON The cluster contains a research paper detailing a new synthetic dataset and benchmark for speech processing tasks.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL TIER_1 English(EN) · Máté Gedeon, Péter Mihajlik

    LibriConvo: Simulating Conversations from Read Literature for ASR and Diarization

    arXiv:2510.23320v2. Abstract: We introduce LibriConvo, a synthetic conversational speech corpus for speaker diarization and automatic speech recognition (ASR), built by instantiating the previously proposed Speaker-Aware Simulated Conversation (SASC) f…