PulseAugur / Brief


Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation

    Researchers have introduced HK-LegiCoST, a new parallel corpus for speech translation research. The corpus contains over 600 hours of Cantonese audio with corresponding Traditional Chinese transcripts and English translations, all aligned at the sentence level. A key challenge was aligning non-verbatim transcripts, which arise when the spoken and written forms of a language differ significantly; this makes the work relevant to languages with vernacular and dialectal speech variation. The corpus is used to demonstrate competitive speech translation baselines and cross-corpus results.