ENTITY natural language processing

PulseAugur coverage of natural language processing: every cluster that mentions the topic across labs, papers, and developer communities, ranked by signal.

Total · 30d

10 over 90d

Releases · 30d

0 over 90d

Papers · 30d

10 over 90d

RELATIONSHIPS

used by Word2vec 70%

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/2 · 32 TOTAL

TOOL · CL_30240 · May 13 · 18:01

Author trains word embeddings from scratch using Dostoevsky novels

The author details their process of building word embeddings from scratch, using Dostoevsky's novels as a corpus of nearly one million words. This step follows their previous work on character-level tokenization and aim…
TOOL · CL_28289 · May 11 · 15:24

Low-resource NLP needs both cross-lingual transfer and specific data

A new paper argues that low-resource natural language processing (NLP) requires a combination of cross-lingual transfer and language-specific development. While cross-lingual transfer can boost performance using data fr…
TOOL · CL_28320 · May 11 · 13:35

New ThreatCore benchmark highlights AI's struggle with implicit threats

Researchers have introduced ThreatCore, a new benchmark dataset designed for fine-grained threat detection in natural language processing. This dataset aims to provide a more consistent and standardized approach to iden…
TOOL · CL_27565 · May 11 · 04:04

New clustering method models annotator perspectives in NLP tasks

Researchers have developed a new agreement-based clustering technique to better model annotator perspectives in subjective Natural Language Processing tasks. This method aims to capture the nuances of disagreement among…
TOOL · CL_24524 · May 9 · 23:10

Transfer learning explained for LLMs, reducing data needs

Transfer learning is a key technique in LLM development, allowing pre-trained models to be adapted for new tasks with reduced data and computational needs. This method leverages existing knowledge from large datasets to…
RESEARCH · CL_23615 · May 8 · 23:10

LLMs Explained: Understanding Transformer Architecture and Applications

This article provides a foundational explanation of Large Language Models (LLMs), detailing their role in revolutionizing Natural Language Processing. It covers how LLMs are trained on extensive text data to understand …
TOOL · CL_22209 · May 8 · 04:00

AI models predict patient risk using clinical notes and temporal data

Researchers have developed two novel methods, HiTGNN and ReVeAL, to improve early risk prediction for chronic diseases using clinical language processing. HiTGNN, a hierarchical temporal graph neural network, effectivel…
RESEARCH · CL_22177 · May 8 · 04:00

Study reveals linguistic cues and annotator attitudes impact harmful language detection

A new paper analyzes annotation variation in NLP datasets, focusing on harmful language detection. The research combines annotator characteristics with linguistic properties of the data to understand labeling discrepanc…
TOOL · CL_20796 · May 7 · 04:00

MambaBack architecture enhances whole slide image analysis with hybrid AI approach

Researchers have introduced MambaBack, a novel hybrid architecture designed to improve whole slide image (WSI) analysis in computational pathology. This new model combines the strengths of Mamba and MambaOut to better c…
RESEARCH · CL_20621 · May 7 · 04:00

Hybrid AI method boosts low-resource Vietnamese NER with LLM data augmentation

Researchers have developed a novel hybrid neurosymbolic framework to improve Named Entity Recognition (NER) for low-resource languages, specifically focusing on Vietnamese. This method combines rule-based processing wit…
COMMENTARY · CL_19115 · May 6 · 09:38

AI professionals urged to optimize skills section for job visibility

In the AI field, professionals often neglect their skills section on platforms like Mastodon, which functions as valuable free advertising space. Underutilizing this section by listing only a few items can lead to reduc…
TOOL · CL_18879 · May 6 · 04:00

EduCoder launches as open-source tool for educational dialogue annotation

Researchers have developed EduCoder, an open-source annotation system specifically designed for educational dialogue transcripts. This tool addresses the unique challenges of coding complex teacher-student and peer inte…
TOOL · CL_15481 · May 5 · 04:00

GenRecEdit framework tackles cold-start items in generative recommendation

Researchers have developed GenRecEdit, a novel framework designed to enhance generative recommendation systems by addressing the challenge of cold-start items. This method adapts model editing techniques, typically used…
TOOL · CL_15911 · May 5 · 04:00

SCARV framework enhances stable sample ranking in redundant NLP datasets

Researchers have developed SCARV, a new framework designed to improve the stability of sample rankings in Natural Language Processing datasets that contain redundancy. Existing methods often produce unstable rankings fo…
TOOL · CL_15943 · May 5 · 04:00

Researchers audit Wikipedia data quality for low-resource NLP tasks

A new study has audited the quality of Wikipedia data for low-resource and multilingual Natural Language Processing (NLP) tasks. Researchers found significant quality issues, including script and language contamination,…
MEME · CL_14936 · May 4 · 20:09

AI job interview for Puerto Rican Spanish NLP specialist turns out to be a scam

A user expressed excitement about a job opportunity requiring expertise in Puerto Rican Spanish and Natural Language Processing. However, their interview experience was disappointing, described as a "clanker."
RESEARCH · CL_14112 · May 1 · 16:45

Directed Social Regard: Surfacing Targeted Advocacy, Opposition, Aid, Harms, and Victimization in Online Media

Researchers have developed a new approach called Directed Social Regard (DSR) to analyze sentiment in online text. Unlike traditional sentiment analysis tools that provide a single positive, neutral, or negative score, …
RESEARCH · CL_11784 · May 1 · 04:00

New VCON framework enables smooth, iterative DNN compression with minimal accuracy loss

Researchers have introduced Vanishing Contributions (VCON), a novel framework designed to streamline the process of compressing deep neural networks. VCON enables a smoother, iterative transition to compressed models by…
RESEARCH · CL_11775 · May 1 · 04:00

New benchmarks reveal LLMs struggle with Arabic and symbolic financial reasoning

Researchers have introduced SAHM, a new benchmark designed to evaluate Arabic financial and Shari'ah-compliant reasoning capabilities in large language models. The benchmark includes over 14,000 expert-verified instance…
RESEARCH · CL_11779 · Apr 30 · 09:55

LLMs analyze language ideologies in Luxembourgish news comments

Researchers have developed a new method using sparse crosscoders to track the emergence and consolidation of linguistic features within large language models during pretraining. This technique, which includes a novel me…
