Researchers have developed a novel method to augment sign language translation (SLT) datasets using large language models (LLMs). This approach generates synthetic video-text pairs by extracting clips from existing gloss-annotated corpora and using an LLM to create new sentence glosses. The synthetic data significantly improves SLT performance, achieving a 2.92 BLEU-4 gain over a baseline, without requiring additional human annotation or generative video models. The study also found that optimizing for visual smoothness in clip transitions can be counterproductive, suggesting abrupt boundaries may offer implicit regularization.
AI IMPACT Enhances sign language translation capabilities by creating larger, more diverse training datasets, potentially improving accessibility for the deaf and hard-of-hearing community.
RANK_REASON Academic paper detailing a new methodology for corpus augmentation in sign language translation.
AI-generated summary · Google Gemini · from 2 sources.