Brief · PulseAugur

TOOL · arXiv cs.AI · English (EN) · 4d

Is Human Annotation Necessary? Iterative MBR Distillation for Error Span Detection in Machine Translation

Researchers have proposed Iterative MBR (Minimum Bayes Risk) Distillation, a method for detecting error spans in machine translation that does not require human annotation. The approach uses a large language model to generate its own training data in the form of pseudo-labels. In the reported experiments, models trained on this self-generated data outperform those trained on human-annotated datasets, particularly at identifying specific error spans.
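To make the pseudo-labeling idea concrete, here is a minimal sketch of generic Minimum Bayes Risk selection over sampled error-span annotations. It is not the authors' implementation: the brief does not say how candidates are sampled or which utility function is used, so the character-level F1 utility and the names `char_f1` and `mbr_select` are assumptions made purely for illustration. The iterative part (retraining the model on the selected pseudo-labels and sampling again) is not shown.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def span_chars(spans: List[Span]) -> Set[int]:
    """Return the set of character positions covered by the spans."""
    covered: Set[int] = set()
    for start, end in spans:
        covered.update(range(start, end))
    return covered


def char_f1(hyp: List[Span], ref: List[Span]) -> float:
    """Character-level F1 between two annotations (assumed utility, for illustration)."""
    hyp_chars, ref_chars = span_chars(hyp), span_chars(ref)
    if not hyp_chars and not ref_chars:
        return 1.0  # both annotations mark no errors: full agreement
    overlap = len(hyp_chars & ref_chars)
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_chars)
    recall = overlap / len(ref_chars)
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates: List[List[Span]]) -> List[Span]:
    """Return the candidate with the highest average utility against the other samples."""
    if not candidates:
        raise ValueError("need at least one candidate annotation")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(char_f1(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


# Hypothetical samples: four annotations of the same translation.
samples = [[(0, 5)], [(0, 5), (10, 12)], [(0, 4)], [(20, 25)]]
pseudo_label = mbr_select(samples)
```

In this toy example, the annotation that agrees most with the other samples is kept as the pseudo-label, while the outlier `[(20, 25)]` scores zero against every other sample and is never selected.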

IMPACT This method could significantly reduce the cost of training machine translation evaluation models and make their training data more consistent.

LLM
Machine Translation
Boxuan Lyu
WMT Metrics Shared Task
Iterative MBR Distillation