ModernBERT
PulseAugur coverage of ModernBERT — every cluster mentioning ModernBERT across labs, papers, and developer communities, ranked by signal.
-
Encoder classifiers offer cost-effective LLM safety evaluation, study finds
A new research paper explores the effectiveness of encoder classifiers, specifically from the ModernBERT family, as a cost-efficient alternative to LLM-based judges for evaluating the safety of large language model outp…
-
moBERTo: New Portuguese Language Model Enhances NLP Tasks
Researchers have introduced moBERTo, a new Portuguese language model derived from ModernBERT through continued pretraining. This model was trained on 60 billion tokens, incorporating data from FineWeb2 and filtered STEM…
-
New technique improves SPLADE retrieval models with larger encoders
Researchers have identified a performance degradation issue when using larger, more powerful pretrained encoders with SPLADE, a neural sparse retrieval model. This problem, termed a "scale mismatch" in the MLM head, can…
-
New LOCUS corpus unlocks U.S. local ordinances for AI research · 2 sources tracked
Researchers have developed LOCUS, a comprehensive corpus of U.S. local ordinances, aiming to make this critical layer of American law accessible for large-scale research and AI applications. The corpus includes codes fr…
-
HyDRA framework dynamically routes LLM queries, cutting costs and improving efficiency
Researchers have developed HyDRA, a novel framework for dynamically routing queries to heterogeneous pools of large language models. Unlike previous methods that make binary strong-vs-weak decisions or require retrainin…
-
New clinical NLP models boost German and Norwegian medical text analysis
Researchers have developed new domain-specific language models for clinical NLP in German and Norwegian. The German ChristBERT models, based on RoBERTa, were trained on a 13.5GB corpus and outperform existing models on …
-
New Tool Offers Context-Aware Japanese Furigana
ezfurigana.com is a new tool that adds context-aware furigana to Japanese text. It combines the Sudachi tokenizer with the ModernBERT model to display accurate phonetic readings.
-
Character-trained AI models fail to maintain personas in agentic tasks
Researchers found that models fine-tuned for specific personas in a chat format struggle to maintain those personas when used in agentic settings. When these character-trained models were prompted to generate emails as …
-
Synthetic LLM data boosts patent classification, but volume is key
A new research paper investigates the effectiveness of synthetic data generated by large language models for low-resource multi-label patent classification. The study found that while synthetic data can significantly bo…
-
New methods tackle AI hallucinations in research and medical Q&A
Two new research papers address the critical issue of AI hallucinations in different domains. One paper introduces ACL-Verbatim, an extractive question-answering system designed to provide hallucination-free answers fro…
-
New methods tackle LLM hallucinations with graph-based and extractive approaches
Researchers are developing new methods to combat hallucinations in large language models, particularly in complex question-answering tasks. One approach involves using graph-based retrieval-augmented generation (RAG) sy…
-
IBM Granite releases open multilingual embedding models with 32K context
IBM has released Granite Embedding Multilingual R2, a suite of open-source multilingual embedding models, on Hugging Face. The release includes a 97M-parameter compact model that leads in retrieval quality among open models und…
-
AI predicts human rater disagreement in LLM-generated difficulty scores
Researchers have developed a new method to predict when AI-generated difficulty ratings for educational materials might disagree with human assessments. This approach uses a separate embedding space, such as that of ModernBERT, to…
-
IBM Granite releases two multilingual embedding models built on ModernBERT
IBM's Granite division has released two new multilingual embedding models, one with 97 million parameters and another with 311 million. These models are based on the ModernBERT architecture and support over 200 languages, w…
-
AI models struggle with emotion nuance, researchers explore new evaluation and generation methods
Researchers are exploring the nuances of emotion in AI, with several papers focusing on Large Language Models (LLMs) and speech processing. One study investigates how well small language models preserve emotions during …