Brief · PulseAugur

RESEARCH · arXiv cs.CL English(EN) · 1d · [2 sources]

How Far Can Machine Translation Quality Take You? Extrinsic Discourse Evaluation in Goal-Oriented Setups

A new research paper examines the limitations of current machine translation (MT) evaluation metrics and proposes extrinsic discourse evaluations. The study introduces an entity counting task to assess referential consistency and uses the Welfare Diplomacy game to evaluate communication and coordination in interactive settings. The findings indicate that high intrinsic MT quality does not guarantee downstream discourse success, and that translation failures can significantly impair coordination in goal-oriented environments.

IMPACT Highlights the need for new evaluation methods that capture real-world performance of machine translation systems.

Welfare Diplomacy
arXiv
machine translation