FormalASR converts spoken Chinese to formal text end-to-end

By PulseAugur Editorial · [1 source] · 2026-06-09 04:00

Researchers have developed FormalASR, an end-to-end system that converts spoken Chinese directly into formal written text. Because it needs no separate post-editing step by an LLM, it reduces latency and computational cost. FormalASR comes in two model sizes, 0.6B and 1.7B parameters, both fine-tuned from Qwen3-ASR and trained on two newly created large-scale datasets, WenetSpeech-Formal and Speechio-Formal.

IMPACT Offers a more efficient and direct method for transcribing spoken language into formal text, potentially improving downstream NLP applications.

RANK_REASON This is a research paper describing a new model and dataset.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Wanyi Ning, Yinshang Guo, Haitao Qian, Jiyuan Cheng, Weiyuan Feng, Yufei Zhang · 2026-06-09 04:00

FormalASR: End-to-End Spoken Chinese to Formal Text

arXiv:2605.19266v2 Announce Type: replace-cross Abstract: Automatic speech recognition (ASR) systems are typically optimized for verbatim transcription, which preserves disfluencies, filler words, and informal spoken structures that are often unsuitable for downstream writing-ori…
