Brief · PulseAugur

RESEARCH · Hugging Face Daily Papers English(EN) · 1w · [5 sources]

TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech

Researchers have developed two new methods for automatically annotating errors in speech translations and transcripts. The first, Speech Translation Error Labelling (STEL), uses existing text-only and multimodal LLMs to identify errors in speech translations, though current systems reach only about half the precision of humans. The second, TalkTag, uses a fine-tuned LLM to automate fine-grained morphosyntactic error annotation of spoken-language transcripts and remains effective even with limited data.

IMPACT: Automating error annotation in speech and transcripts could accelerate research and development in natural language processing and clinical linguistics.

LLM
TalkTag
Qwen2.5-Omni
Speech Translation Error Labelling
XCOMET