Can Large Language Models Reliably Correct Errors in Low-Resource ASR? A Contamination-Aware Case Study on West Frisian
Researchers examined how effectively large language models (LLMs) correct errors in low-resource automatic speech recognition (ASR), focusing on West Frisian. The study introduced a contamination-aware methodology that evaluates on both public data and a custom offline dataset, so that observed improvements cannot be attributed to the LLM having seen the test material. The findings indicate that LLM-based error correction generally improves ASR performance, with one model even achieving a word error rate below the oracle word error rate, which suggests genuine correction capability.
IMPACT Demonstrates LLMs' potential to improve speech recognition for under-resourced languages, opening new avenues for accessibility and data collection.
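
To make the oracle comparison in the summary concrete, the following is a minimal Python sketch of word error rate (WER) and an oracle WER. It assumes the oracle is the best hypothesis in each ASR n-best list, which is a common definition but is not confirmed by the summary. The function names and the English toy sentences are hypothetical placeholders, not data or code from the study.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(refs, hyps):
    """Corpus-level WER: total word edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words


def oracle_wer(refs, nbest_lists):
    """WER when the best hypothesis of each n-best list is always chosen.

    Assumption: "oracle" means n-best oracle selection.
    """
    errors = sum(
        min(edit_distance(r.split(), h.split()) for h in nbest)
        for r, nbest in zip(refs, nbest_lists)
    )
    words = sum(len(r.split()) for r in refs)
    return errors / words


# Toy data (hypothetical, English placeholders).
refs = ["the cat sat on the mat"]
nbest_lists = [["the cat sat on a mat", "the bat sat on the hat"]]
corrected = ["the cat sat on the mat"]  # imagined LLM-corrected output

top1_wer = wer(refs, [nbest[0] for nbest in nbest_lists])
best_possible = oracle_wer(refs, nbest_lists)
corrected_wer = wer(refs, corrected)
```

Under this definition, selecting among existing hypotheses can never go below the oracle WER. A corrected output that scores lower, as in the toy example, must contain words that appear in no single hypothesis, which is why a below-oracle result reads as evidence of correction and not just reranking.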