Brief

last 24h

[50/2976] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

RESEARCH · Hugging Face Blog English(EN) · 34mo

Open-sourcing Knowledge Distillation Code and Weights of SD-Small and SD-Tiny

Hugging Face has released the code and weights for SD-Small and SD-Tiny, two smaller versions of its Stable Diffusion model. These models were created using knowledge distillation, a technique that trains a smaller model to mimic the behavior of a larger one. The goal is to make powerful image generation models more accessible and efficient for researchers and developers. AI
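To make the distillation idea above concrete, here is a minimal sketch of a distillation loss in its generic classifier form: the student is penalized for diverging from the teacher's temperature-softened output distribution. This is an illustration only, not the loss used to train SD-Small or SD-Tiny. The function names, the temperature value, and the toy logits are all assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits: one student tracks the teacher, the other does not.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 4.0, 1.0]

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
print(loss_close < loss_far)
```

The student that mimics the teacher more closely receives the smaller loss, which is the signal that drives the smaller model toward the larger model's behavior.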
RESEARCH · Medium — MLOps tag English(EN) · 34mo · [63 sources]

Building Secure AI Gateways with MLflow AI Gateway

Google Research has introduced ReasoningBank, a novel framework designed to enhance AI agents' ability to learn from their experiences, both successes and failures, after deployment. This system distills generalizable reasoning strategies from past interactions, allowing agents to continuously improve and avoid repeating mistakes. Separately, new research explores optimizing multi-agent communication through latent representations and introduces Agent Evolving Learning (AEL) for agents operating in open-ended environments, focusing on how to effectively use remembered information. Additionally, DeepSeek has released preview models of its V4 series, offering large context windows and advanced capabilities at a significantly lower cost than comparable frontier models. AI

IMPACT New frameworks for agent learning and memory, alongside cost-effective frontier models, could accelerate AI adoption in complex tasks and personalized applications.
- MLflow
- OpenAI
- Portkey
- LiteLLM
- MLflow AI Gateway
- Claude Opus 4.7
- GPT-5.5
- Gemini
- Anthropic
- OpenRouter
- ReasoningBank
- DeepSeek
- DeepSeek-V4-Pro
- DeepSeek-V4-Flash
- AI agents
- LLM
- Hugging Face
- AgenticQwen
- Nemobot
- DiffMAS
- Agent Evolving Learning (AEL)
- Memora
- Google
RESEARCH · Practical AI English(EN) · 35mo

There's a new Llama in town

Meta AI has released Llama 2, a new large language model that is expected to significantly impact the LLM landscape. The episode also covers Zip-NeRF, a separate NeRF model capable of generating 3D scenes from 2D images. The hosts also compare new OpenAI functionality with Anthropic's Claude 2. AI
RESEARCH · Hugging Face Blog English(EN) · 35mo

Happy 1st anniversary 🤗 Diffusers!

The Hugging Face Diffusers library celebrated its first anniversary, marking a significant milestone in the open-source AI community. Since its launch, Diffusers has become a pivotal tool for researchers and developers working with diffusion models, enabling easier experimentation and deployment of generative AI applications. The library's success highlights the growing importance of accessible and collaborative platforms for advancing AI research and development. AI
RESEARCH · Hugging Face Blog English(EN) · 35mo · [3 sources]

Welcoming Llama Guard 4 on Hugging Face Hub

Meta AI has released Llama 4, a new family of open-source large language models, available on Hugging Face. This release includes Llama Guard 4, a model specifically designed for safety, and two other models, Maverick and Scout. The availability of these models on Hugging Face Hub facilitates broader access and experimentation within the AI community. AI
RESEARCH · Hugging Face Blog English(EN) · 35mo

Open-Source Text Generation & LLM Ecosystem at Hugging Face

This Hugging Face post surveys the open-source text generation and LLM ecosystem; "os-llms" appears to be the post's identifier rather than the name of a model. The post aims to foster a more collaborative and accessible ecosystem for large language models. It emphasizes community involvement and aims to democratize access to powerful AI tools. AI
RESEARCH · Practical AI English(EN) · 35mo

Cambrian explosion of generative models

The "Practical AI" podcast discusses the recent surge in generative models, highlighting open-source advancements like Stable Diffusion XL and Zeroscope XL. Hosts Daniel and Chris predict that open models will eventually dominate the AI landscape, similar to open-source software. They also address the emerging challenges associated with this rapid progress, including cybersecurity risks, impacts on productivity, and broader cultural implications. AI
RESEARCH · Hugging Face Blog English(EN) · 36mo

What's going on with the Open LLM Leaderboard?

The Hugging Face Open LLM Leaderboard team examines why MMLU scores shown on the leaderboard differed from scores reported elsewhere. MMLU is a comprehensive test of language model knowledge across 57 subjects, covering a wide range of academic and professional domains. The post attributes the gaps to differences between evaluation implementations, with the aim of a more robust and comparable ranking of open-source large language models. AI
RESEARCH · Hugging Face Blog English(EN) · 36mo

Fine-Tune MMS Adapter Models for low-resource ASR

Hugging Face has released new adapter models for their MMS (Massively Multilingual Speech) ASR system. These adapters are designed to improve performance on low-resource languages, enabling better speech recognition for a wider range of linguistic communities. The release focuses on making ASR technology more accessible and effective for languages with limited existing training data. AI
RESEARCH · Hugging Face Blog English(EN) · 36mo

Yes, Transformers are Effective for Time Series Forecasting (+ Autoformer)

Researchers have demonstrated the effectiveness of Transformer models for time series forecasting tasks. The Autoformer architecture, specifically designed for this purpose, shows strong performance by decomposing the time series into seasonal and trend components. This approach allows for more accurate predictions by handling complex temporal dependencies. AI
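The seasonal and trend split mentioned above can be sketched with a moving average: the smoothed series is treated as the trend and the remainder as the seasonal part. This is a simplified, standalone illustration of series decomposition, not the Autoformer implementation. The window size, the edge padding, and the toy series are assumptions.

```python
def moving_average(series, window):
    """Centered moving average; edges are padded by repeating the end values."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the average is centered")
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]

def decompose(series, window=5):
    """Split a series into a smooth trend and a seasonal remainder."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal

# Hypothetical series: a linear trend plus a spike every fourth step.
series = [10 + 0.5 * t + (3 if t % 4 == 0 else -1) for t in range(24)]
trend, seasonal = decompose(series, window=5)

# The two components always add back up to the original series.
reconstructed = [t + s for t, s in zip(trend, seasonal)]
```

Because the components sum back to the input, a model can forecast each part separately and add the forecasts together.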
RESEARCH · Hugging Face Blog (CA) · 36mo

Can foundation models label data like humans?

A Hugging Face blog post, written in the context of the Open LLM Leaderboard, examines whether large language models (LLMs) can label data with human-level accuracy. This approach could significantly speed up and reduce the cost of data annotation for training AI models. The post discusses the potential and challenges of using LLMs in this capacity, particularly in comparison to traditional human annotators. AI
RESEARCH · Latent Space Podcast English(EN) · 36mo

From RLHF to RLHB: The Case for Learning from Human Behavior - with Jeffrey Wang and Joe Reeve of Amplitude

Amplitude, a company known for its product analytics, is focusing heavily on integrating AI into its offerings. They are exploring methods beyond traditional Reinforcement Learning from Human Feedback (RLHF), which relies on explicit, often costly, and potentially biased user input. Instead, Amplitude advocates for learning from real user behavior within products, citing examples like GitHub Copilot and Midjourney, where implicit feedback is gathered naturally through user interaction. This approach aims to provide more authentic and cost-effective data for training AI models, potentially making AI analytics more crucial than AI itself. AI
RESEARCH · Hugging Face Blog English(EN) · 36mo

The Falcon has landed in the Hugging Face ecosystem

The Falcon large language model has been integrated into the Hugging Face ecosystem. This integration makes the model more accessible to developers and researchers. Falcon is known for its strong performance on various benchmarks and its open-source nature. AI
RESEARCH · OpenAI News English(EN) · 36mo

Improving mathematical reasoning with process supervision

OpenAI has developed a new method called process supervision to improve AI's mathematical reasoning capabilities. This technique rewards each correct step in a problem-solving process, rather than just the final answer, leading to better performance and reduced hallucinations. The company found that process supervision not only enhances accuracy but also offers alignment benefits by directly training models to produce human-endorsed reasoning chains. OpenAI has released its dataset to encourage further research into this promising approach. AI
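The contrast between rewarding only the final answer and rewarding each step can be sketched as follows. The per-step probabilities stand in for the output of a step-level reward model and are invented for illustration. Scoring a solution as the product of its step probabilities is one simple aggregation, assumed here rather than taken from the summary above.

```python
from math import prod

def outcome_reward(final_answer, expected_answer):
    """Outcome supervision: only the final answer is checked."""
    return 1.0 if final_answer == expected_answer else 0.0

def process_score(step_probs):
    """Process supervision: chance that every step is correct (steps treated as independent)."""
    return prod(step_probs)

def first_bad_step(step_probs, threshold=0.5):
    """Index of the first step the step-level model distrusts, or None."""
    for index, prob in enumerate(step_probs):
        if prob < threshold:
            return index
    return None

# Two hypothetical solutions that both reach the right answer, 12.
sound_steps = [0.98, 0.95, 0.97]   # every step looks valid
lucky_steps = [0.98, 0.10, 0.97]   # a flawed middle step that happens to cancel out

same_outcome = outcome_reward(12, 12)
sound_score = process_score(sound_steps)
lucky_score = process_score(lucky_steps)
```

Outcome supervision cannot tell the two solutions apart, while the step-level score penalizes the flawed reasoning and `first_bad_step` points to where it went wrong.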
RESEARCH · EleutherAI Blog English(EN) · 37mo · [2 sources]

🐶Safetensors audited as really safe and becoming the default

The safetensors library, developed by Hugging Face in collaboration with EleutherAI and Stability AI, has undergone a security audit by Trail of Bits, confirming its safety. This audit allows the organizations to move towards making safetensors the default format for saving and loading machine learning models, replacing the less secure pickle format used by PyTorch. The library offers benefits such as faster loading times and lazy loading capabilities, and will now be installed by default in the transformers library. AI
- Hugging Face
- EleutherAI
- Stability AI
- Trail of Bits
- safetensors
- LLaMA
- transformers
- PaddlePaddle
- NumPy
- JAX
- TensorFlow
- PyTorch
- StarCoder
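On the safetensors item above, the reason pickle is considered less secure can be shown with the standard library alone: a pickled object may name a callable that runs while the file is being loaded. In this sketch `int` is a harmless stand-in for that callable, and the JSON header is a hypothetical data-only description loosely modeled on a tensor index, not the exact safetensors layout.

```python
import json
import pickle

class Malicious:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call int('42')".
        # int is a harmless stand-in; any importable callable could go here.
        return (int, ("42",))

blob = pickle.dumps(Malicious())
result = pickle.loads(blob)  # runs int("42") instead of rebuilding Malicious
print(type(result).__name__, result)

# A data-only header, by contrast, parses to plain values and runs nothing.
header = json.loads('{"weight": {"dtype": "F32", "shape": [2, 2]}}')
print(header["weight"]["shape"])
```

Loading the pickle executed a call chosen by whoever wrote the file, which is why untrusted pickle-based checkpoints are risky and why a format restricted to plain data avoids the problem.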
RESEARCH · Latent Space Podcast English(EN) · 37mo

MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML

MosaicML has released MPT-7B, an open-source transformer model trained on one trillion tokens that matches LLaMA-7B's quality and is licensed for commercial use. This model boasts an impressive context length of up to 84,000 tokens, significantly exceeding limitations found in models like GPT-3. MosaicML also open-sourced its LLM Foundry codebase used for training and evaluation, alongside three fine-tuned versions of MPT-7B, including one specialized for long-form storytelling. AI
RESEARCH · Practical AI English(EN) · 37mo

Creating instruction tuned models

Erin Mikail Staples discussed the creation of instruction-tuned Large Language Models at ODSC East. The conversation focused on the critical role of human feedback in this process. Staples also highlighted the significance of open data and practical tools for data annotation and fine-tuning custom generative AI models. AI
RESEARCH · Hugging Face Blog English(EN) · 37mo

Smaller is better: Q8-Chat, an efficient generative AI experience on Xeon

Hugging Face has released Q8-Chat, a new generative AI model optimized for Intel Xeon CPUs. This model aims to provide an efficient AI experience directly on standard server hardware without requiring specialized GPUs. The development focuses on making powerful AI capabilities more accessible and cost-effective for a wider range of applications. AI
RESEARCH · Hugging Face Blog English(EN) · 37mo

Run a Chatgpt-like Chatbot on a Single GPU with ROCm

Hugging Face has released a new guide detailing how to run a ChatGPT-like chatbot on a single AMD GPU using ROCm. This enables users with consumer-grade hardware to deploy powerful conversational AI models locally. The guide focuses on optimizing performance and accessibility for individuals and smaller organizations. AI
RESEARCH · Latent Space Podcast English(EN) · 37mo · [2 sources]

RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious

The RWKV (Receptance Weighted Key Value) project introduces a novel architecture that revives Recurrent Neural Networks (RNNs) while incorporating advantages typically found in Transformers. This approach aims to overcome the scaling limitations of traditional Transformers, particularly in training and inference, while maintaining competitive performance on reasoning benchmarks. The RWKV project is characterized by its distributed, international, and largely volunteer-driven community, drawing parallels to early EleutherAI efforts. AI
- RWKV
- Transformer
- RNN
- Hugging Face
- EleutherAI
- Eugene Cheah
- UIlicious
- Attention Is All You Need
- Apple
- GPT
- The Pile
- Alpaca
- OpenAI
RESEARCH · Hugging Face Blog English(EN) · 37mo

Creating a Coding Assistant with StarCoder

Hugging Face describes how to build a coding assistant on top of StarCoder, a large language model trained for code generation. StarCoder was trained on a massive dataset of permissively licensed code from GitHub, and the post fine-tunes it into a chat-style assistant called StarChat. The aim is to provide developers with a powerful and accessible tool for various coding tasks. AI
RESEARCH · Latent Space Podcast English(EN) · 37mo

Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit

Replit has open-sourced its new code-focused large language model, replit-code-v1-3b. This model, which is significantly smaller than OpenAI's Codex, reportedly outperforms it on the HumanEval benchmark when fine-tuned on Replit's data. The release was discussed in an interview with Replit's Head of AI, Reza Shabani, who detailed the journey of training the model and its potential applications for developers. AI
RESEARCH · Practical AI English(EN) · 37mo

Large models on CPUs

Mark Kurtz discusses the significant advancements in optimizing large AI models for CPU inference, highlighting that a substantial portion of model parameters often do not impact outputs. This optimization work, particularly through tools like Neural Magic's SparseML and SparseGPT, enables running complex generative AI models on standard hardware, reducing the reliance on expensive GPUs and making AI more accessible. AI
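The claim that many parameters barely affect outputs can be illustrated with plain magnitude pruning: zero the smallest weights and compare the result before and after. This is the simplest form of pruning, not the SparseGPT algorithm or the SparseML API. The weights, inputs, and 50% sparsity level are invented for the example.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

def dot(weights, inputs):
    """Output of a single linear unit."""
    return sum(w * x for w, x in zip(weights, inputs))

# Hypothetical unit: three large weights carry most of the signal.
weights = [0.90, -0.02, 0.01, -0.75, 0.03, 0.60, -0.01, 0.02]
inputs = [1.0, 2.0, -1.0, 0.5, 1.5, -2.0, 3.0, 1.0]

sparse = magnitude_prune(weights, sparsity=0.5)
dense_out = dot(weights, inputs)
sparse_out = dot(sparse, inputs)
print(dense_out, sparse_out)
```

Half of the weights are removed, yet the output moves only slightly. Zeroed weights can be skipped at inference time, which is what makes sparse models attractive on CPUs.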
RESEARCH · Latent Space Podcast English(EN) · 38mo

Mapping the future of *truly* Open Models and Training Dolly for $30 — with Mike Conover of Databricks

Databricks has released Dolly 2.0, an instruction-following large language model that is fully open source and commercially viable. Unlike LLaMA, Dolly 2.0's license permits business use, addressing a key limitation of previous open models. The model was fine-tuned on a human-generated instruction dataset and can be customized for specific data and styles, with Databricks offering a notebook to facilitate this process for approximately $30 in 30 minutes. AI
RESEARCH · Hugging Face Blog English(EN) · 38mo

Running IF with 🧨 diffusers on a Free Tier Google Colab

Hugging Face has released a guide on how to run the new open-source IF text-to-image model (DeepFloyd IF) using their diffusers library on a free tier Google Colab instance. This allows users to experiment with the model's capabilities without requiring powerful local hardware. The guide provides practical steps for setting up the environment and running inference, making advanced image generation accessible to a wider audience. AI
RESEARCH · Hugging Face Blog English(EN) · 38mo

Graph Classification with Transformers

Hugging Face has released a new blog post detailing how to perform graph classification tasks using Transformer models. The post provides a practical guide, likely aimed at researchers and developers, on leveraging the power of Transformers for analyzing graph-structured data. This approach could open new avenues for applying advanced deep learning techniques to domains where graph data is prevalent. AI
RESEARCH · Latent Space Podcast English(EN) · 38mo

Segment Anything Model and the Hard Problems of Computer Vision — with Joseph Nelson of Roboflow

Meta AI has released its Segment Anything Model (SAM), a significant advancement in computer vision, which includes the model, weights, data, and a demo website. This open-source release is notable for its extensive dataset, containing significantly more images and masks than previous datasets. The podcast features Joseph Nelson of Roboflow discussing SAM's capabilities, including its zero-shot transfer and promptability, and demonstrating its integration into Roboflow's platform. The discussion also touches upon the broader landscape of multimodal AI and the remaining challenges in computer vision. AI
RESEARCH · Hugging Face Blog English(EN) · 38mo

StackLLaMA: A hands-on guide to train LLaMA with RLHF

Hugging Face has released StackLLaMA, a LLaMA model fine-tuned with Reinforcement Learning from Human Feedback (RLHF) on Stack Exchange questions and answers. The accompanying guide walks through the training stages, from supervised fine-tuning to reward modeling and reinforcement learning. The release aims to make RLHF training accessible to the AI development community. AI
RESEARCH · EleutherAI Blog English(EN) · 38mo

Exploratory Analysis of TRLX RLHF Transformers with TransformerLens

Researchers have demonstrated a method for training and analyzing language models using Reinforcement Learning from Human Feedback (RLHF). The process involves using the TRLX library for RLHF fine-tuning and TransformerLens for mechanistic interpretability. This approach was used to fine-tune a GPT-2 model to generate negatively biased movie reviews and then analyze the model to identify specific network regions responsible for this behavior. AI
RESEARCH · Hugging Face Blog English(EN) · 39mo

Fast Inference on Large Language Models: BLOOMZ on Habana Gaudi2 Accelerator

Hugging Face has released a new guide detailing how to achieve fast inference for large language models like BLOOMZ using Habana Gaudi2 accelerators. The guide provides practical steps and optimizations for developers looking to leverage this hardware for efficient LLM deployment. This collaboration aims to make powerful AI models more accessible and performant on specialized hardware. AI
RESEARCH · METR (Model Evaluation & Threat Research) English(EN) · 39mo · [3 sources]

ARC Evals is now METR

The Alignment Research Center's (ARC) evaluation team has officially spun off to form a new independent nonprofit organization named METR (Model Evaluation & Threat Research). METR will continue its work on evaluating frontier AI systems, focusing on their autonomous capabilities and potential threats, including AI self-improvement and evasion of oversight. The organization, led by Beth Barnes, has previously partnered with leading AI labs like OpenAI and Anthropic for evaluations and aims to develop rigorous testing methodologies to ensure AI safety before widespread deployment. AI
RESEARCH · OpenAI News English(EN) · 39mo

Preserving languages for the future

Iceland has partnered with OpenAI to leverage GPT-4 for the preservation of the Icelandic language, which is at risk of decline due to digitalization. A team of 40 volunteers is using Reinforcement Learning from Human Feedback (RLHF) to train GPT-4 on proper Icelandic grammar and cultural nuances. This initiative aims not only to safeguard Icelandic but also to create a model for preserving other low-resource languages globally, preventing an "AI divide." AI
RESEARCH · Hugging Face Blog English(EN) · 39mo

New ViT and ALIGN Models From Kakao Brain

Kakao Brain has released two new models, ViT and ALIGN, available on Hugging Face. The Vision Transformer (ViT) model is designed for image recognition tasks, while the ALIGN model focuses on image-text matching. These releases aim to advance research and development in computer vision and multimodal AI. AI
RESEARCH · Hugging Face Blog English(EN) · 39mo · [2 sources]

Train your ControlNet with diffusers

Hugging Face has released updated documentation and guides for training ControlNet models using their diffusers library. These resources aim to simplify the process for developers and researchers looking to fine-tune or create their own ControlNet models for image generation tasks. The guides provide practical steps and code examples to leverage the diffusers library effectively. AI
RESEARCH · Hugging Face Blog English(EN) · 39mo

Ethical Guidelines for developing the Diffusers library

Hugging Face has released ethical guidelines for the development and use of its Diffusers library, a popular open-source tool for creating diffusion models. These guidelines aim to promote responsible AI development by addressing potential harms associated with generative image models. The company encourages developers to consider the societal impact of their creations and to implement safeguards against misuse. AI
RESEARCH · Hugging Face Blog English(EN) · 40mo

Zero-shot image-to-text generation with BLIP-2

BLIP-2, a novel approach to zero-shot image-to-text generation from Salesforce Research, is now available through Hugging Face. This model leverages pre-trained language models and vision transformers to achieve impressive performance without task-specific fine-tuning. BLIP-2 demonstrates strong capabilities in image captioning and visual question answering. AI
RESEARCH · Hugging Face Blog English(EN) · 40mo

Parameter-Efficient Fine-Tuning using 🤗 PEFT

Hugging Face has released a new library called PEFT (Parameter-Efficient Fine-Tuning) to simplify the process of adapting large language models. This library offers several efficient fine-tuning techniques, such as LoRA, Prefix Tuning, and P-Tuning, which allow users to modify models with significantly fewer trainable parameters. By reducing computational costs and memory requirements, PEFT aims to make advanced LLM customization more accessible to a wider range of researchers and developers. AI
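To show where the parameter savings come from, here is a minimal sketch of the LoRA idea in plain Python rather than the PEFT API. The frozen weight matrix W is left untouched and only two small matrices, A and B, are trained, giving y = W x + (alpha / r) · B (A x). The layer size, rank, and alpha values are assumptions chosen for illustration.

```python
def full_params(d_out, d_in):
    """Trainable parameters when fine-tuning the whole weight matrix."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Trainable parameters with LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, rank):
    """y = W x + (alpha / rank) * B (A x); W stays frozen, only A and B train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]

# Hypothetical 4096 x 4096 projection with rank 8.
full = full_params(4096, 4096)
lora = lora_params(4096, 4096, rank=8)
print(full, lora, f"{100 * lora / full:.2f}%")

# Tiny forward pass: with B initialized to zero the adapter changes nothing.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
x = [2.0, 3.0]
untrained = lora_forward(W, A, [[0.0], [0.0]], x, alpha=16, rank=1)
trained = lora_forward(W, A, [[0.5], [0.0]], x, alpha=16, rank=1)
```

In this example the adapter trains 65,536 parameters instead of 16,777,216, under half a percent of the full matrix, and a zero-initialized B means fine-tuning starts from exactly the base model's behavior.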
RESEARCH · Hugging Face Blog English(EN) · 40mo

Speech Synthesis, Recognition, and More With SpeechT5

SpeechT5, a versatile model for various speech tasks, is now available through Hugging Face. It can perform speech recognition, synthesis, and speaker identification. The model uses a T5-inspired encoder-decoder architecture and offers strong performance across these different applications. AI
RESEARCH · Hugging Face Blog English(EN) · 41mo

Universal Image Segmentation with Mask2Former and OneFormer

Hugging Face has released Mask2Former and OneFormer, advanced models for universal image segmentation. These models offer a unified approach to various segmentation tasks, including semantic, instance, and panoptic segmentation. Their architecture allows for improved performance and efficiency across a range of computer vision applications. AI
RESEARCH · Hugging Face Blog English(EN) · 41mo

Welcome PaddlePaddle to the Hugging Face Hub

Baidu's PaddlePaddle deep learning framework is now available on the Hugging Face Hub. This integration allows developers to access and utilize PaddlePaddle models alongside other popular frameworks within the Hugging Face ecosystem. The move aims to broaden the reach of PaddlePaddle and foster greater collaboration within the AI development community. AI
RESEARCH · Hugging Face Blog English(EN) · 41mo · [2 sources]

Accelerating PyTorch Transformers with Intel Sapphire Rapids - part 2

Hugging Face has released a two-part blog series detailing how to accelerate PyTorch Transformer models using Intel's Sapphire Rapids CPUs. The posts provide practical guidance and optimizations for leveraging these processors for efficient AI inference. This collaboration aims to improve performance and accessibility for running large language models on widely available hardware. AI
RESEARCH · Hugging Face Blog English(EN) · 42mo

Zero-shot image segmentation with CLIPSeg

Researchers have introduced CLIPSeg, a novel zero-shot image segmentation model that leverages the power of CLIP. This approach allows for flexible and intuitive image segmentation by enabling users to specify desired objects using natural language prompts. CLIPSeg demonstrates strong performance across various segmentation tasks without requiring task-specific training data. AI
RESEARCH · OpenAI News English(EN) · 42mo

Point-E: A system for generating 3D point clouds from complex prompts

OpenAI has introduced Point-E, a new system capable of generating 3D point clouds from text prompts significantly faster than previous methods. Unlike other approaches that take hours, Point-E can produce a 3D model in just one to two minutes using a single GPU. The system first creates a synthetic image from the text prompt using a diffusion model, then generates the 3D point cloud based on that image with a second diffusion model. While the quality may not yet match the absolute state-of-the-art, its speed offers a practical advantage for certain applications, and OpenAI has released the pre-trained models. AI
RESEARCH · Hugging Face Blog English(EN) · 42mo

Faster Training and Inference: Habana Gaudi®2 vs Nvidia A100 80GB

Habana Gaudi2 processors demonstrate competitive performance against Nvidia's A100 80GB GPUs for transformer training and inference tasks. Benchmarks show Gaudi2 achieving faster training times and lower inference latency on the specific workloads tested in the post. This suggests Gaudi2 as a viable alternative for AI infrastructure, offering potential cost and performance benefits. AI
RESEARCH · Practical AI English(EN) · 42mo

SOTA machine translation at Unbabel

Unbabel researchers discussed state-of-the-art machine translation at EMNLP 2022. They highlighted innovations in quality estimation, including their COMET framework. COMET is designed for training multilingual machine translation evaluation models. AI
RESEARCH · OpenAI News English(EN) · 42mo

Discovering the minutiae of backend systems

OpenAI is highlighting the work of its backend engineering team, focusing on the infrastructure that supports large-scale AI model training. The team addresses the complexities of running exploratory AI workflows on massive supercomputing clusters, aiming to preempt research needs and resolve performance bottlenecks. They encounter unique hardware and software challenges due to the sheer scale of OpenAI's operations, often pushing the boundaries of what third-party vendors have previously experienced. AI
RESEARCH · Hugging Face Blog English(EN) · 43mo · [2 sources]

Multivariate Probabilistic Time Series Forecasting with Informer

Hugging Face has released new resources for time series forecasting using their Transformers library. The first resource details the Informer model, which is designed for multivariate probabilistic time series forecasting. The second resource provides a broader guide on leveraging Transformers for probabilistic time series forecasting tasks. AI
RESEARCH · Hugging Face Blog English(EN) · 43mo · [5 sources]

🧨 Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e

Hugging Face has released updates to accelerate Stable Diffusion XL image generation across various platforms. Optimized inference is now available for Cloud TPU v5e using JAX, significantly speeding up processing on Google Cloud. Additionally, advanced Core ML quantization techniques are being implemented for enhanced performance on Apple devices, including iPhones, iPads, and Macs. AI
RESEARCH · Hugging Face Blog Deutsch(DE) · 43mo

VQ-Diffusion

Hugging Face has released VQ-Diffusion, a novel text-to-image generation model that utilizes a Vector Quantized (VQ) Variational Autoencoder (VAE) for improved efficiency and quality. This approach allows for faster training and inference compared to traditional diffusion models. The model is available on Hugging Face, enabling researchers and developers to experiment with and build upon its capabilities. AI
RESEARCH · arXiv cs.LG English(EN) · 43mo · [113 sources]

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

Researchers are developing new methods to evaluate and enhance Large Language Models (LLMs). Apple's research proposes a benchmark to test LLMs' understanding of context, finding that quantized models and pre-trained dense models struggle with nuanced contextual features. Meanwhile, a new technique called Retrieval-Augmented Linguistic Calibration (RALC) improves how LLMs express confidence in their answers, enhancing faithfulness and calibration. Other research explores LLMs for clinical action extraction, demonstrating comparable performance to supervised models but highlighting limitations in clinical reasoning, and introduces Listwise Policy Optimization for more stable and diverse LLM training. AI

IMPACT New benchmarks and calibration techniques aim to improve LLM reliability and reasoning, potentially impacting their application in critical domains like healthcare and scientific discovery.