Tajik
PulseAugur coverage of Tajik — every cluster mentioning Tajik across labs, papers, and developer communities, ranked by signal.
-
New benchmark study explores neural network performance on Tajik POS tagging
This paper introduces the first benchmark for part-of-speech tagging in the Tajik language, evaluating several neural network architectures. The study uses the TajPersParallel corpus, focusing on context-independent…
-
New NLP guide covers tokenization to RLHF with open-weight models
A new preprint details a practical guide to the modern Natural Language Processing (NLP) pipeline, covering everything from tokenization to reinforcement learning from human feedback. The guide is structured as a reprod…
-
New study benchmarks machine transliteration models between Tajik and Farsi
This paper introduces a new benchmark for machine transliteration between Tajik and Farsi, developing a unique parallel corpus from diverse sources. The study compares six model architectures, including rule-based syste…