NLP
PulseAugur coverage of NLP: every cluster mentioning NLP across labs, papers, and developer communities, ranked by signal.
-
Author trains word embeddings from scratch using Dostoevsky novels
The author details their process of building word embeddings from scratch, using Dostoevsky's novels as a corpus of nearly one million words. This step follows their previous work on character-level tokenization and aim…
-
Low-resource NLP needs both cross-lingual transfer and specific data
A new paper argues that low-resource natural language processing (NLP) requires a combination of cross-lingual transfer and language-specific development. While cross-lingual transfer can boost performance using data fr…
-
New clustering method models annotator perspectives in NLP tasks
Researchers have developed a new agreement-based clustering technique to better model annotator perspectives in subjective natural language processing tasks. This method aims to capture the nuances of disagreement among…
-
Transfer learning explained for LLMs, reducing data needs
Transfer learning is a key technique in LLM development, allowing pre-trained models to be adapted for new tasks with reduced data and computational needs. This method leverages existing knowledge from large datasets to…