PulseAugur / Brief



Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Scaling Conversational Hungarian ASR: The BEA-Dialogue+ Corpus

    Researchers have introduced BEA-Dialogue+, an expanded corpus for Hungarian conversational automatic speech recognition (ASR). The corpus raises the available training data to 200 hours by relaxing the split criteria, which admits more material while maintaining speaker separation. In evaluations with Whisper and FastConformer models, the larger dataset significantly improves transcription accuracy, especially when combined with Serialized Output Training (SOT) fine-tuning.

    IMPACT Provides a larger, more challenging benchmark for Hungarian dialogue ASR, enabling better training and evaluation of transcription systems.
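
For readers unfamiliar with Serialized Output Training, the idea is to fold the transcripts of several speakers into one target sequence, ordered by start time and separated by a speaker-change token. The sketch below is a simplified illustration of that idea and is not taken from the BEA-Dialogue+ work: the `Utterance` class, the `serialize_sot` helper, the `<sc>` token string, and the sample Hungarian phrases are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed marker for a speaker change; real systems define their own token.
SPEAKER_CHANGE = "<sc>"


@dataclass
class Utterance:
    speaker: str  # hypothetical speaker label
    start: float  # start time in seconds
    text: str     # reference transcript of the utterance


def serialize_sot(utterances, sc_token=SPEAKER_CHANGE):
    """Join utterances into one SOT-style target string.

    Utterances are sorted by start time, and the speaker-change token is
    inserted whenever the speaker differs from the previous utterance.
    """
    ordered = sorted(utterances, key=lambda u: u.start)
    parts = []
    prev_speaker = None
    for utt in ordered:
        if prev_speaker is not None and utt.speaker != prev_speaker:
            parts.append(sc_token)
        parts.append(utt.text.strip())
        prev_speaker = utt.speaker
    return " ".join(parts)


# Made-up two-speaker exchange, given out of order on purpose.
dialogue = [
    Utterance("B", 1.2, "jól köszönöm"),
    Utterance("A", 0.0, "szia hogy vagy"),
]
target = serialize_sot(dialogue)
print(target)
```

A model fine-tuned on targets of this shape learns to emit the speaker-change token itself, which is why a single-stream recognizer can be adapted to dialogue without a separate output branch per speaker.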