PulseAugur

Updated corpus EPIC-EuroParl-UdS aids translation and interpreting research

This paper introduces EPIC-EuroParl-UdS, an updated, combined English-German corpus of European Parliament speeches and their translations and interpretations. The resource now has corrected metadata, improved linguistic annotations, and new layers such as word alignment and surprisal indices. It supports research into information-theoretic approaches to language variation, comparisons of written and spoken modes, and analysis of translationese. A new study within the paper validates the spoken data and evaluates GPT-2 and machine translation models on predicting filler particles in interpreting.

IMPACT Provides a refined dataset for research into information-theoretic approaches to language, potentially improving machine translation and interpreting models.

RANK_REASON The item is a research paper detailing a new corpus and its application.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL TIER_1 English (EN) · Maria Kunilovskaya, Christina Pollkläsener

    EPIC-EuroParl-UdS: Information-Theoretic Perspectives on Translation and Interpreting

    arXiv:2603.09785v3 Announce Type: replace Abstract: This paper introduces an updated and combined version of the bidirectional English-German EPIC-UdS (spoken) and EuroParl-UdS (written) corpora containing original European Parliament speeches as well as their translations and in…