PulseAugur

FormalASR system converts spoken Chinese to formal text end-to-end

Researchers have developed FormalASR, an end-to-end system that converts spoken Chinese directly into formal written text. Because it does not need a separate large language model (LLM) for post-editing, it reduces latency and computational cost for on-device applications. FormalASR uses fine-tuned Qwen3-ASR models at 0.6B and 1.7B parameters, trained on two newly created datasets, WenetSpeech-Formal and Speechio-Formal, and reports significant reductions in character error rate along with improvements in text quality metrics.

Summary written by gemini-2.5-flash-lite from 1 source.
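
For readers unfamiliar with the character error rate (CER) mentioned in the summary, the following is a minimal sketch of the standard definition: the character-level edit distance between a reference and a hypothesis, divided by the reference length. The function names and the example strings are hypothetical, and the paper's exact scoring and text normalization are not described in this summary, so this should be read as an illustration of the metric rather than the authors' evaluation code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per character."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: ref character missing from hyp
                curr[j - 1] + 1,     # insertion: extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: the hypothesis drops one of six characters.
    reference = "今天天气很好"
    hypothesis = "今天天气好"
    print(f"CER = {character_error_rate(reference, hypothesis):.3f}")
```

Because Chinese text is not whitespace-delimited, ASR results for Chinese are usually reported per character rather than per word, which is why CER is the metric quoted here.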

IMPACT Offers a more efficient, on-device solution for spoken-to-written text conversion, reducing reliance on multi-stage LLM pipelines.

RANK_REASON The cluster describes a new academic paper detailing a novel model and dataset for speech recognition.

Read on arXiv cs.CL →

COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Yufei Zhang ·

    FormalASR: End-to-End Spoken Chinese to Formal Text

    Automatic speech recognition (ASR) systems are typically optimized for verbatim transcription, which preserves disfluencies, filler words, and informal spoken structures that are often unsuitable for downstream writing-oriented applications. A common workaround is a two-stage ASR…