PulseAugur
LIVE 21:46:59
research · [2 sources] ·

New benchmark and corpus advance Ancient Greek to Modern Greek translation

Researchers have developed a new benchmark and dataset for translating Ancient Greek to Modern Greek, a task previously hindered by a lack of parallel data. The AG-MG Parallel Corpus contains over 132,000 sentence pairs, created using a novel pipeline involving web scraping, advanced alignment techniques, and LLM-based error correction with Gemini 2.5 Flash. Experiments show that fine-tuning models like Llama-Krikri-8B and M2M100 significantly improves translation quality, with the best model achieving a BLEU score of 13.16. AI

Summary written by gemini-2.5-flash-lite from 2 sources. How we write summaries →

IMPACT Advances low-resource language translation, potentially enabling new applications in historical linguistics and digital humanities.

RANK_REASON The cluster describes a new academic paper introducing a novel benchmark and dataset for a specific machine translation task.

Read on arXiv cs.CL →

New benchmark and corpus advance Ancient Greek to Modern Greek translation

COVERAGE [2]

  1. arXiv cs.CL TIER_1 · Maria Giagkou ·

    Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models

    Machine Translation (MT) for Ancient Greek (AG) to Modern Greek (MG) is a low-resource task, constrained by the lack of large-scale, high-quality parallel data. We address this gap by introducing the AG-MG Parallel Corpus, a new resource containing 132,481 sentence-aligned pairs …

  2. Hugging Face Daily Papers TIER_1 ·

    Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models

    Machine Translation (MT) for Ancient Greek (AG) to Modern Greek (MG) is a low-resource task, constrained by the lack of large-scale, high-quality parallel data. We address this gap by introducing the AG-MG Parallel Corpus, a new resource containing 132,481 sentence-aligned pairs …