Romanian GEC corpus and Transformer models introduced

By PulseAugur Editorial · [1 source] · 2026-04-28 04:00

Researchers have introduced a new corpus and neural models for Grammatical Error Correction (GEC) in Romanian. The work addresses the scarcity of GEC resources for non-English languages, where available spellcheckers are mostly limited to simple corrections and rules. The best-performing model reached an F0.5 score of 53.76 after pre-training on artificially generated data and then fine-tuning on the newly created Romanian GEC corpus.
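For readers unfamiliar with the metric, F0.5 is the standard F-beta score with beta = 0.5, which weights precision more heavily than recall; GEC work commonly reports it because a wrong correction is usually considered worse than a missed one. The sketch below shows the general formula only. The function name `f_beta` and the edit counts are hypothetical illustrations, not figures or code from the paper, and published scores such as 53.76 are typically the same quantity expressed on a 0-100 scale.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Compute F-beta from edit-level counts.

    tp: proposed corrections that match the reference
    fp: proposed corrections that do not match the reference
    fn: reference corrections the system missed
    beta < 1 favours precision; beta = 1 gives the usual F1.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, chosen only to illustrate the weighting:
# precision = 60 / 100 = 0.6, recall = 60 / 150 = 0.4
f05 = f_beta(tp=60, fp=40, fn=90, beta=0.5)
f1 = f_beta(tp=60, fp=40, fn=90, beta=1.0)
print(f"F0.5 = {f05:.4f}, F1 = {f1:.4f}")
```

Because precision exceeds recall in this made-up example, F0.5 comes out higher than F1, which is the behaviour the metric is designed to reward.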

IMPACT Provides a new GEC dataset and models for Romanian, potentially improving NLP tools for the language.

RANK_REASON This is a research paper introducing a new dataset and models for a specific NLP task in a low-resource language.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Teodor-Mihai Cotet, Stefan Ruseti, Mihai Dascalu · 2026-04-28 04:00

Neural Grammatical Error Correction for Romanian

arXiv:2604.23627v1. Abstract: Resources for Grammatical Error Correction (GEC) in non-English languages are scarce, while available spellcheckers in these languages are mostly limited to simple corrections and rules. In this paper we introduce a first GEC corpus…
