Paper: AI models exploit translators' work as data without credit

By PulseAugur Editorial · [1 source] · 2026-05-26 04:00

A new paper examines how translators' work has become a foundational data source for AI, particularly machine translation. It argues that translation memories and parallel corpora, though crucial for training AI models, are often acquired without attribution or compensation to the translators who produced them. The paper introduces the concepts of "appropriation without consumption" and the "invisible teacherisation" of translators to describe this process, and examines legal frameworks and data supply chains in order to propose redistributive design solutions.

IMPACT Highlights ethical concerns regarding data sourcing for AI, potentially influencing future data collection and compensation models for human labor.

RANK_REASON The cluster contains a single academic paper discussing AI and data ethics.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Masaru Yamada · 2026-05-26 04:00

Translators as Invisible Teachers of AI: Copyright, Translation Memory, and the Political Economy of Linguistic Data

arXiv:2605.24842v1 Announce Type: new Abstract: This paper examines how the labour of translators has been transformed into foundational data capital for the age of artificial intelligence (AI). Translation memories (TM) and parallel corpora preserve a one-to-one correspondence b…

