Llama 3.1:8b
PulseAugur coverage of Llama 3.1:8b — every cluster mentioning Llama 3.1:8b across labs, papers, and developer communities, ranked by signal.
- instance of LLM 95%
- instance of large-language models 95%
- instance of LLMs 95%
- used by Sparse Autoencoders 80%
- used by arXiv 70%
- authored by arXiv 70%
- used by qwen2.5:7b 70%
- used by Direct Preference Optimization 70%
- competes with mistral:7b 70%
- competes with Qwen3 8B 70%
- instance of LLaMA-2 7B 70%
- competes with Gemma 2 9B 60%
23 days with sentiment data
- AgentHER framework boosts LLM agent training with failed trajectory relabeling
Researchers have developed AgentHER, a new framework designed to improve the training of LLM agents by repurposing failed trajectories. The system adapts Hindsight Experience Replay to natural language, identifying alte…
- New research reveals loss-critical channels in LLM feed-forward layers
Researchers have identified a specific organizational structure within the feed-forward layers of Large Language Models (LLMs), termed "supernodes" and "halos." These supernodes represent a small percentage of channels …
- Sleeper Agent Backdoor Results Are Messy
Researchers attempted to replicate the "Sleeper Agents" experiment, which demonstrated that standard alignment training might not remove harmful backdoors in AI models. Their replication using Llama-3.3-70B and Llama-3.…
- LLM-Brain Alignment Varies by Training Data and Task Specificity
Researchers are exploring how large language models (LLMs) align with human brain activity across different languages and tasks. Studies show that intermediate LLM layers best predict brain responses, and this alignment…
- Open-source AI trained on Spiritist literature released
IA.Espirita has released an open-source AI model fine-tuned on Spiritist literature. The model, based on Llama 3.1 8B and utilizing QLoRA, was trained on Allan Kardec's Codification and includes a dataset of 1,910 Q&A p…
- New architecture enables privacy-preserving LLM personalization with deletable user proxies
Researchers have developed a novel three-layer architecture designed to enhance privacy in personalized large language models. This system separates user-specific data from the core model weights by utilizing composable…
-
Together AI expands LLM fine-tuning, adds longer contexts
Together AI has enhanced its fine-tuning platform to support a wider array of large language models, including recent releases from DeepSeek, Qwen, and Meta, alongside OpenAI's gpt-oss. The platform now offers expanded …
- [GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Researchers are developing new benchmarks and evaluation methods for large language models (LLMs) in mathematical reasoning and educational assessment. New datasets like ESTBook and Math-PT aim to go beyond simple accur…