TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech
Researchers have developed two new methods for automatically annotating errors in transcribed speech. One approach, Speech Translation Error Labelling (STEL), uses existing text-only and multimodal LLMs to identify errors in speech translations, though current systems achieve about half the precision of humans. The other method, TalkTag, employs a fine-tuned LLM to automate fine-grained morphosyntactic error annotation in spoken-language transcripts, proving effective even with limited data.
AI IMPACT: Automating error annotation in speech and transcripts could accelerate research and development in natural language processing and clinical linguistics.