PulseAugur
research · [2 sources] ·

New benchmark study explores neural network performance on Tajik POS tagging

This paper introduces the first benchmark for part-of-speech (POS) tagging in the Tajik language and evaluates several neural network architectures on it. The study uses the TajPersParallel corpus and frames the task as context-independent classification of isolated lexical units. The mBERT model fine-tuned with LoRA performed best, though all models struggled with morphological ambiguity in the absence of syntactic context.
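
To see why isolated-word classification runs into trouble with ambiguous forms, consider a rough sketch of the accuracy ceiling for any tagger that sees one word at a time. The function name `context_free_ceiling` and the toy English data are assumptions made for illustration; neither comes from the paper or from the TajPersParallel corpus.

```python
from collections import Counter, defaultdict


def context_free_ceiling(tagged_tokens):
    """Best accuracy any tagger can reach if it sees each word in isolation.

    Without context, a word form always gets the same prediction, so the
    best strategy is to pick that form's most frequent tag.
    """
    if not tagged_tokens:
        raise ValueError("need at least one (word, tag) pair")
    tag_counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        tag_counts[word][tag] += 1
    # Count the tokens that the majority tag of each word form gets right.
    correct = sum(c.most_common(1)[0][1] for c in tag_counts.values())
    return correct / len(tagged_tokens)


# Hypothetical toy data (English, NOT taken from the Tajik corpus).
toy = [
    ("book", "NOUN"), ("book", "NOUN"), ("book", "VERB"),
    ("run", "VERB"), ("run", "NOUN"),
    ("the", "DET"), ("the", "DET"), ("quickly", "ADV"),
]
ceiling = context_free_ceiling(toy)
print(f"ceiling = {ceiling:.3f}")  # prints "ceiling = 0.750"
```

In this toy set, two of the eight tokens are ambiguous forms carrying their minority tag, so no context-independent model can exceed 75% regardless of architecture. The actual figures for Tajik would have to come from the paper itself.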

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Establishes a baseline for NLP tasks in Tajik, highlighting challenges in morphological ambiguity for low-resource languages.

RANK_REASON This is a research paper presenting a new benchmark and comparative study of neural architectures for a specific NLP task.


COVERAGE [2]

  1. arXiv cs.CL TIER_1 · Mullosharaf K. Arabov ·

    Benchmarking POS Tagging for the Tajik Language: A Comparative Study of Neural Architectures on the TajPersParallel Corpus

    arXiv:2605.04576v1 · Abstract: This paper presents the first benchmark for the task of automatic part-of-speech (POS) tagging for the Tajik language. Despite the existence of multilingual language models demonstrating high effectiveness for many of the world's la…

  2. arXiv cs.CL TIER_1 · Mullosharaf K. Arabov ·

    Benchmarking POS Tagging for the Tajik Language: A Comparative Study of Neural Architectures on the TajPersParallel Corpus

    This paper presents the first benchmark for the task of automatic part-of-speech (POS) tagging for the Tajik language. Despite the existence of multilingual language models demonstrating high effectiveness for many of the world's languages, their capacity for grammatical analysis…