Researchers have developed a new benchmark and dataset for translating Ancient Greek to Modern Greek, a task previously hindered by a lack of parallel data. The AG-MG Parallel Corpus contains over 132,000 sentence pairs, created using a novel pipeline involving web scraping, advanced alignment techniques, and LLM-based error correction with Gemini 2.5 Flash. Experiments show that fine-tuning models like Llama-Krikri-8B and M2M100 significantly improves translation quality, with the best model achieving a BLEU score of 13.16. AI
Summary written by gemini-2.5-flash-lite from 2 sources. How we write summaries →
IMPACT Advances low-resource language translation, potentially enabling new applications in historical linguistics and digital humanities.
RANK_REASON The cluster describes a new academic paper introducing a novel benchmark and dataset for a specific machine translation task.