Universal Dependencies
PulseAugur coverage of Universal Dependencies — every cluster mentioning Universal Dependencies across labs, papers, and developer communities, ranked by signal.
-
New toolkit enhances syntactic analysis of child-adult language interactions
Researchers have developed a new open-source toolkit called CAIT, designed to improve the syntactic analysis of child-adult interactions within the CHILDES corpus. This toolkit includes a specialized dependency parser t…
-
New Marathi POS Tagging Dataset and BERT Models Released
Researchers have introduced L3Cube-MahaPOS, a new dataset for Marathi Part-of-Speech (POS) tagging, addressing the scarcity of annotated resources for the language. The dataset contains over 32,000 manually annotated se…
-
New Czech language treebanks released for NLP research · 4 sources tracked
Researchers have released two new papers detailing advancements in Czech language processing resources. The first paper introduces the Prague Dependency Treebank -- Consolidated 2.0 (PDT-C 2.0), an extensive, uniformly …
-
New method boosts Coptic-English machine translation with syntax
Researchers have developed a new in-context learning approach for low-resource machine translation of Coptic to English. This method incorporates syntactic information from Universal Dependencies parses, alongside bilin…
-
LLM Research Explores Syntactic Encoding and Reasoning Efficiency
Two new research papers explore the internal workings of Large Language Models (LLMs) and their reasoning capabilities. One paper investigates whether LLMs encode formal syntactic structures beyond what is captured by s…
-
New pipeline creates NLP resource for historical Greek parliamentary text
Researchers have developed a new, reproducible pipeline for creating a Universal Dependencies-style parsing resource for Katharevousa Greek parliamentary text. This workflow addresses the limitations of current NLP tool…
-
L2 Korean annotation uses parser agreement for human-in-the-loop workflow
Researchers have developed a new human-in-the-loop annotation workflow for L2 Korean using agreement between two parsers. This method leverages parser agreement as a proxy for annotation correctness, showing strong corr…
-
Gemma 4 QAT models spark debate over performance and utility
Users are discussing the performance and utility of Gemma 4 QAT (Quantization-Aware Training) models, particularly in comparison with standard quantizations. While some users report improved speed and quality for general…
-
Researchers release comprehensive Russian legislative corpus for NLP tasks
Researchers have introduced a new corpus containing Russian primary and secondary legislation from 1991 to 2025. This dataset includes over 300,000 texts, totaling more than 194 million tokens. The corpus is offered in …