ENTITY Llama

PulseAugur coverage of Llama — every cluster mentioning Llama across labs, papers, and developer communities, ranked by signal.

Total · 30d: 152 (152 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 88 (88 over 90d)

TIER MIX · 90D

frontier release 2
significant 5
research 44
tool 79
commentary 18
meme 4

SENTIMENT · 30D

27 days with sentiment data

RECENT · PAGE 7/8 · 152 TOTAL

RESEARCH · CL_09365 · Apr 29 · 18:05

Databricks faces 'extraordinary' copyright damages in author lawsuit over LLM training data

A U.S. judge has allowed a class-action lawsuit to proceed against Databricks, alleging that its DBRX large language model was trained on pirated copyrighted books. The authors claim Databricks acquired MosaicML, whic…
RESEARCH · CL_09240 · Apr 29 · 15:00

Friendly AI chatbots more prone to conspiracy theories, study finds

Researchers have discovered that making AI chatbots more friendly can lead to a significant decrease in their accuracy and an increased tendency to support conspiracy theories. Studies showed that warmer chatbots were 3…
RESEARCH · CL_08642 · Apr 29 · 04:00

Transformer architecture significantly impacts model error detection capabilities

A new paper reveals that a transformer model's architecture significantly impacts its ability to signal decision quality through internal activations, a property termed 'observability.' This observability is crucial for…
RESEARCH · CL_07820 · Apr 28 · 18:03

Stanford researchers develop new hardware to efficiently process sparse AI models

Researchers at Stanford University have developed a novel hardware chip designed to efficiently process sparse AI models. Sparsity, where most AI model parameters are zero, offers significant computational savings but i…
RESEARCH · CL_08315 · Apr 28 · 10:23

LLM Hallucinations Linked to Commitment Failure, New Quantization Framework Introduced

A new paper proposes that LLM hallucinations stem not from a lack of knowledge, but from a failure in commitment, where models disperse probability mass across alternatives instead of concentrating on the correct answer…
RESEARCH · CL_06871 · Apr 28 · 04:00

Sequence models predict heart failure patient instability and mortality

Researchers have developed sequence models to predict one-year clinical instability and mortality in heart failure patients using electronic health records. The study, conducted on a Swedish cohort of over 42,000 patien…
RESEARCH · CL_06752 · Apr 28 · 04:00

Researchers develop new methods to debias and improve reward models for LLMs

Researchers have developed new methods to improve the reliability and interpretability of reward models (RMs) used in aligning large language models (LLMs). One approach introduces a causally motivated intervention tech…
RESEARCH · CL_06737 · Apr 28 · 04:00

New research introduces 'Geometric Canary' for LLM steerability and drift detection

Researchers have developed a new method called "geometric stability" to assess language models. This technique measures the consistency of a model's internal representation to predict its steerability and detect perform…
RESEARCH · CL_06664 · Apr 28 · 04:00

Removing LayerNorm in LLMs acts as implicit regularizer, impacting performance based on training data size

Researchers have investigated the impact of removing Layer Normalization (LayerNorm) from neural network architectures, particularly in models like GPT-2 and Llama. Their findings indicate that replacing LayerNorm with …
RESEARCH · CL_05239 · Apr 27 · 05:52

OpenKB & OpenRouter enable vectorless AI knowledge bases; LoRA's production limits revealed

A new study suggests that the low-rank assumption underlying LoRA and QLoRA fine-tuning methods may not hold true in production environments. While these techniques enable efficient adaptation of large language models o…
RESEARCH · CL_05151 · Apr 27 · 04:00

New research enables faster, more efficient LLMs on mobile devices

Researchers have developed new methods for deploying large language models on mobile devices, focusing on reducing latency and memory usage. One approach, MobileLLM-Flash, uses hardware-in-the-loop architecture search a…
RESEARCH · CL_05138 · Apr 27 · 04:00

LLMs show categorical perception and optimized data selection

Researchers have developed a new framework for optimizing data selection in large language models, adapting data weighting to specific tasks and models using efficient proxies. Another study investigates categorical per…
RESEARCH · CL_03552 · Apr 24 · 04:31

Machine learning practitioners debate Nanochat vs. Llama for training models from scratch

A user is seeking advice on choosing a model architecture for a new training run, aiming for an open-source project compatible with the Hugging Face Transformers library. Their previous project successfully used Nanocha…
RESEARCH · CL_03002 · Apr 23 · 17:50

New methods enhance LLM adaptation with efficient, structured low-rank tuning

Researchers have introduced MLorc, a novel method for memory-efficient adaptation of large language models that compresses parameter momentum during training. This approach aims to reduce memory demands without sacrific…
RESEARCH · CL_02956 · Apr 23 · 14:08

Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers

Researchers have developed a new defense mechanism called Tail-risk Intrinsic Geometric Smoothing (TIGS) to protect large language models from backdoor attacks. TIGS operates during inference without requiring model upd…
RESEARCH · CL_05407 · Apr 20 · 13:37

AdaLeZO speeds up LLM fine-tuning with adaptive layer sampling

Researchers have developed AdaLeZO, a new framework designed to make Zeroth-Order (ZO) optimization more efficient for fine-tuning Large Language Models. This method addresses the slow convergence and high variance typi…
FRONTIER RELEASE · CL_01750 · Apr 2 · 05:44

Google releases open-weight Gemma 4 multimodal models with long context

Google DeepMind has released Gemma 4, a new family of open-weight models licensed under Apache 2.0, marking a significant advancement in their open-source AI offerings. The models are designed for reasoning and agentic …
RESEARCH · CL_39746 · Mar 4 · 00:00

New methods tackle LLM KV cache compression for long contexts

Multiple research papers released in May and June 2026 propose novel methods for compressing the Key-Value (KV) cache in large language models (LLMs). These techniques aim to reduce the significant memory overhead assoc…
TOOL · CL_17669 · Feb 23 · 20:16

Most AI models fail simple 'car wash' reasoning test, Opper finds

A new benchmark called the "Car Wash Test" reveals that many leading AI models struggle with basic reasoning. When asked whether to walk or drive 50 meters to a car wash, 42 out of 53 tested models incorrectly suggested…
SIGNIFICANT · CL_45251 · Feb 6 · 00:00

Together AI expands LLM fine-tuning, adds longer contexts

Together AI has enhanced its fine-tuning platform to support a wider array of large language models, including recent releases from DeepSeek, Qwen, and Meta, alongside OpenAI's gpt-oss. The platform now offers expanded …