Brief

last 24h

[50/8385] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
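The brief does not publish its scoring formula, so the following is only a minimal sketch of how four such components could be combined into a 0-100 score. The weights, the 24-hour half-life, and the names `Story`, `recency`, and `score` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights (sum to 1.0); the real formula is not given in the brief.
WEIGHTS = {"authority": 0.35, "cluster": 0.30, "headline": 0.20, "recency": 0.15}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay term


@dataclass
class Story:
    authority: float   # 0..1, trust in the source
    cluster: float     # 0..1, strength of the cluster of matching reports
    headline: float    # 0..1, strength of the headline signal
    age_hours: float   # hours since publication


def recency(age_hours: float) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def score(story: Story) -> float:
    """Weighted sum of the four components, scaled to 0-100."""
    parts = {
        "authority": story.authority,
        "cluster": story.cluster,
        "headline": story.headline,
        "recency": recency(story.age_hours),
    }
    total = sum(WEIGHTS[name] * value for name, value in parts.items())
    return round(100 * total, 1)
```

With these assumed weights, a story that is strong on every component but several days old loses only the recency share of its score, which is one way an aggregator can keep authoritative items visible for longer.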

RESEARCH · Hugging Face Blog English(EN) · 40mo

Parameter-Efficient Fine-Tuning using 🤗 PEFT

Hugging Face has released a new library called PEFT (Parameter-Efficient Fine-Tuning) to simplify the process of adapting large language models. This library offers several efficient fine-tuning techniques, such as LoRA, Prefix Tuning, and P-Tuning, which allow users to modify models with significantly fewer trainable parameters. By reducing computational costs and memory requirements, PEFT aims to make advanced LLM customization more accessible to a wider range of researchers and developers. AI
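To make "significantly fewer trainable parameters" concrete, here is a small arithmetic sketch for LoRA, which trains a low-rank update B·A instead of a full weight matrix. The hidden size and rank below are assumed values chosen for illustration, not figures from the post, and the helper names are hypothetical.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a LoRA update B @ A (A: rank x d_in, B: d_out x rank)."""
    return rank * d_in + d_out * rank


d = 4096  # assumed hidden size, for illustration only
r = 8     # assumed LoRA rank

full = full_params(d, d)
lora = lora_params(d, d, r)
ratio = lora / full

print(f"full: {full:,}  lora: {lora:,}  share trainable: {ratio:.2%}")
```

Under these assumptions, the adapter trains 65,536 weights per matrix instead of 16,777,216, which is under half a percent of the original count.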
RESEARCH · Hugging Face Blog English(EN) · 40mo

Speech Synthesis, Recognition, and More With SpeechT5

Hugging Face has added SpeechT5, a model originally developed by Microsoft Research, to its Transformers library. A single encoder-decoder Transformer backbone, paired with task-specific pre-nets and post-nets, handles text-to-speech, speech-to-speech voice conversion, and automatic speech recognition. The design is inspired by T5 rather than being a T5 model itself. AI
FRONTIER RELEASE · Hugging Face Blog English(EN) · 40mo · [624 sources]

A Dive into Vision-Language Models

Alibaba's Qwen team has released Qwen3.7-Plus, a new multimodal agent model designed to integrate vision and language capabilities for versatile agentic tasks. This release is part of a broader trend highlighted by Hugging Face, which features multiple new vision-language models and techniques. The platform showcases advancements like Google's PaliGemma 2, Microsoft's Florence-2, and Meta's Idefics2, alongside methods for aligning and optimizing these models. AI

IMPACT Alibaba's Qwen3.7-Plus release advances multimodal agent capabilities, while Hugging Face's featured models and techniques highlight broader progress in vision-language understanding and alignment.
- SmolVLM
- Hugging Face
- PaliGemma 2
- Google
- Florence-2
- Microsoft
- Idefics2
- PaliGemma
- SigLIP 2
- Qwen3.7-Plus
- Alibaba
- Meta
TOOL · OpenAI News English(EN) · 40mo

New AI classifier for indicating AI-written text

OpenAI has discontinued its AI text classifier due to low accuracy, with the tool only correctly identifying 26% of AI-generated text and mislabeling human text 9% of the time. The company stated that the classifier was unreliable, especially for shorter texts and non-English content, and could be evaded by editing. OpenAI is researching more effective methods for detecting AI-generated content and aims to develop tools for identifying AI-generated audio and visual media. AI
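The two rates quoted above can be combined into a quick worked example of how trustworthy a "likely AI-written" flag would be. This is a sketch only: the share of AI-written text in the input is not given in the summary, so `ai_share` is a hypothetical parameter.

```python
TPR = 0.26  # share of AI-written text correctly flagged (from the summary)
FPR = 0.09  # share of human-written text wrongly flagged (from the summary)


def precision(ai_share: float) -> float:
    """Probability that a flagged text is really AI-written.

    ai_share is the assumed fraction of AI-written texts among all inputs.
    """
    flagged_ai = TPR * ai_share
    flagged_human = FPR * (1.0 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)


for share in (0.5, 0.1):
    print(f"AI share {share:.0%}: precision {precision(share):.1%}")
```

If half the inputs were AI-written, roughly three in four flags would be correct. At an assumed 10% share, fewer than one in four would be, which illustrates why the quoted error rates matter so much in practice.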
RESEARCH · Hugging Face Blog English(EN) · 41mo

Universal Image Segmentation with Mask2Former and OneFormer

Hugging Face has added Mask2Former and OneFormer, two models for universal image segmentation, to its Transformers library. Both take a unified approach in which a single architecture handles semantic, instance, and panoptic segmentation instead of requiring a separate design per task. This simplifies building computer vision applications that need more than one kind of segmentation. AI
RESEARCH · Hugging Face Blog English(EN) · 41mo

Welcome PaddlePaddle to the Hugging Face Hub

Baidu's PaddlePaddle deep learning framework is now available on the Hugging Face Hub. This integration allows developers to access and utilize PaddlePaddle models alongside other popular frameworks within the Hugging Face ecosystem. The move aims to broaden the reach of PaddlePaddle and foster greater collaboration within the AI development community. AI
TOOL · OpenAI News English(EN) · 41mo

Delivering nuanced insights from customer feedback

Yabble, a customer feedback analysis platform, has integrated OpenAI's GPT-3 to significantly accelerate its insight generation process. This integration allows Yabble to transform complex, unstructured customer data into actionable themes and subthemes in minutes, a task that previously took days or weeks. The enhanced capabilities enable Yabble Query to handle more sophisticated user questions, providing more relevant and insightful responses to inform business strategies. AI
TOOL · OpenAI News Italiano(IT) · 41mo

Fine-tuning GPT-3 to scale video creation

Waymark, a video creation platform, has enhanced its scriptwriting capabilities by fine-tuning OpenAI's GPT-3 models. This integration allows users to generate original and customized video scripts in seconds, significantly reducing the time and effort previously spent on editing generic copy. The move transforms Waymark into a "natural-language video creation platform," making video advertising more accessible and efficient for businesses. AI
RESEARCH · Hugging Face Blog English(EN) · 41mo · [2 sources]

Accelerating PyTorch Transformers with Intel Sapphire Rapids - part 2

Hugging Face has released a two-part blog series detailing how to accelerate PyTorch Transformer models using Intel's Sapphire Rapids CPUs. The posts provide practical guidance and optimizations for leveraging these processors for efficient AI inference. This collaboration aims to improve performance and accessibility for running large language models on widely available hardware. AI
TOOL · OpenAI News English(EN) · 41mo

Creating next-gen characters

Inworld AI uses OpenAI's GPT-3 to help power its platform for creating AI characters. The goal is more sophisticated and interactive personas for games and other applications, with GPT-3 helping to generate dynamic, engaging character dialogue and behavior. AI
RESEARCH · Hugging Face Blog English(EN) · 42mo

Zero-shot image segmentation with CLIPSeg

Researchers have introduced CLIPSeg, a novel zero-shot image segmentation model that leverages the power of CLIP. This approach allows for flexible and intuitive image segmentation by enabling users to specify desired objects using natural language prompts. CLIPSeg demonstrates strong performance across various segmentation tasks without requiring task-specific training data. AI
RESEARCH · OpenAI News English(EN) · 42mo

Point-E: A system for generating 3D point clouds from complex prompts

OpenAI has introduced Point-E, a new system capable of generating 3D point clouds from text prompts significantly faster than previous methods. Unlike other approaches that take hours, Point-E can produce a 3D model in just one to two minutes using a single GPU. The system first creates a synthetic image from the text prompt using a diffusion model, then generates the 3D point cloud based on that image with a second diffusion model. While the quality may not yet match the absolute state-of-the-art, its speed offers a practical advantage for certain applications, and OpenAI has released the pre-trained models. AI
RESEARCH · Hugging Face Blog English(EN) · 42mo

Faster Training and Inference: Habana Gaudi®2 vs Nvidia A100 80GB

Hugging Face benchmarked Habana Gaudi2 against the Nvidia A100 80GB on training and inference workloads. In the reported runs, which cover BERT pre-training, Stable Diffusion inference, and T5-3B fine-tuning, Gaudi2 achieved faster training and lower inference latency than the A100. The results position Gaudi2 as a viable alternative for AI infrastructure, with potential cost and performance benefits. AI
RESEARCH · Practical AI English(EN) · 42mo

SOTA machine translation at Unbabel

Unbabel researchers discussed state-of-the-art machine translation at EMNLP 2022. They highlighted innovations in quality estimation, including their COMET framework. COMET is designed for training multilingual machine translation evaluation models. AI
TOOL · Hugging Face Blog English(EN) · 42mo

From GPT2 to Stable Diffusion: Hugging Face arrives to the Elixir community

Hugging Face has integrated its AI models with the Elixir programming language through the Bumblebee library. This integration allows developers to easily run various AI models, including text generation models like GPT-2 and image generation models like Stable Diffusion, directly within Elixir applications. The goal is to make advanced AI capabilities more accessible to the Elixir community, enabling them to build more sophisticated applications. AI
RESEARCH · OpenAI News English(EN) · 42mo

Discovering the minutiae of backend systems

OpenAI is highlighting the work of its backend engineering team, focusing on the infrastructure that supports large-scale AI model training. The team addresses the complexities of running exploratory AI workflows on massive supercomputing clusters, aiming to preempt research needs and resolve performance bottlenecks. They encounter unique hardware and software challenges due to the sheer scale of OpenAI's operations, often pushing the boundaries of what third-party vendors have previously experienced. AI
RESEARCH · Hugging Face Blog English(EN) · 43mo · [2 sources]

Multivariate Probabilistic Time Series Forecasting with Informer

Hugging Face has released new resources for time series forecasting using their Transformers library. The first resource details the Informer model, which is designed for multivariate probabilistic time series forecasting. The second resource provides a broader guide on leveraging Transformers for probabilistic time series forecasting tasks. AI
RESEARCH · Hugging Face Blog English(EN) · 43mo · [5 sources]

🧨 Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e

Hugging Face has released updates to accelerate Stable Diffusion XL image generation across various platforms. Optimized inference is now available for Cloud TPU v5e using JAX, significantly speeding up processing on Google Cloud. Additionally, advanced Core ML quantization techniques are being implemented for enhanced performance on Apple devices, including iPhones, iPads, and Macs. AI
TOOL · OpenAI News English(EN) · 43mo · [67 sources]

Making ChatGPT better for clinicians

OpenAI is making ChatGPT more accessible and useful for various professional teams by offering it for free to verified U.S. clinicians and expanding its capabilities for customer success, operations, marketing, sales, and research. These updates focus on leveraging ChatGPT to streamline workflows, improve communication, synthesize information, and generate structured outputs, ultimately aiming to increase efficiency and effectiveness in these professional domains. The enhancements include features for data analysis, advanced web searching with citations, and voice interaction, all designed to help users move from raw information to actionable insights and decisions more rapidly. AI

IMPACT Expands ChatGPT's utility for specialized professional workflows, potentially increasing adoption across diverse business functions.
RESEARCH · Hugging Face Blog Deutsch(DE) · 43mo

VQ-Diffusion

Hugging Face has released VQ-Diffusion, a novel text-to-image generation model that utilizes a Vector Quantized (VQ) Variational Autoencoder (VAE) for improved efficiency and quality. This approach allows for faster training and inference compared to traditional diffusion models. The model is available on Hugging Face, enabling researchers and developers to experiment with and build upon its capabilities. AI
RESEARCH · arXiv cs.LG English(EN) · 43mo · [113 sources]

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

Researchers are developing new methods to evaluate and enhance Large Language Models (LLMs). Apple's research proposes a benchmark to test LLMs' understanding of context, finding that quantized models and pre-trained dense models struggle with nuanced contextual features. Meanwhile, a new technique called Retrieval-Augmented Linguistic Calibration (RALC) improves how LLMs express confidence in their answers, enhancing faithfulness and calibration. Other research explores LLMs for clinical action extraction, demonstrating comparable performance to supervised models but highlighting limitations in clinical reasoning, and introduces Listwise Policy Optimization for more stable and diverse LLM training. AI

IMPACT New benchmarks and calibration techniques aim to improve LLM reliability and reasoning, potentially impacting their application in critical domains like healthcare and scientific discovery.
RESEARCH · Hugging Face Blog Deutsch(DE) · 43mo

Diffusion Models Live Event

Hugging Face is hosting a live event focused on diffusion models, a type of generative AI used for creating images and other data. The event will feature discussions and demonstrations related to the latest advancements and applications of these models. Attendees can expect insights from experts in the field. AI
TOOL · Hugging Face Blog Română(RO) · 43mo

Accelerating Document AI

Hugging Face has released a new suite of tools and models designed to accelerate the development and deployment of Document AI applications. This initiative aims to simplify the process of building systems that can understand, process, and extract information from various document types. The platform offers pre-trained models and a streamlined workflow for developers to integrate these capabilities into their own projects. AI
RESEARCH · Practical AI English(EN) · 43mo

The practicalities of releasing models

This podcast episode delves into the practical challenges and considerations involved in releasing AI models. The discussion specifically touches upon the Open RAIL-M license and the process of making models available on platforms like Hugging Face. Additional topics explored include graph neural networks, message passing techniques, and the fine-tuning of synthesized voices. AI
TOOL · Replit blog English(EN) · 44mo

Meet Replit Ghostwriter, your partner in code

Replit has launched Ghostwriter, an AI-powered coding assistant, to the public. The tool offers features such as in-line code suggestions, code explanation, code transformation, and code generation, aiming to make programming faster and more accessible. Ghostwriter is available on a subscription basis, with a limited trial offered to early users. AI

IMPACT Enhances developer productivity by automating coding tasks and providing assistance across various stages of software development.
- Giuseppe
- Krish
- Arnav
- Muhammad
- Ted
- Devin
- Søren
- Replit
- Ghostwriter
- Alex
RESEARCH · Hugging Face Blog English(EN) · 44mo

🧨 Stable Diffusion in JAX / Flax !

Hugging Face has released a JAX/Flax implementation of Stable Diffusion, a popular text-to-image generation model. This new version allows researchers and developers to leverage the performance benefits of JAX for running and fine-tuning Stable Diffusion. The release includes pre-trained weights and examples to facilitate experimentation and integration into existing workflows. AI
RESEARCH · Practical AI English(EN) · 44mo

What's up, DocQuery?

Impira has released an open-source ML model called DocQuery, designed to help users query semi-structured and unstructured documents using LLMs. The model can process various document types, including invoices and contracts, enabling users to ask questions and extract information more efficiently. This tool aims to provide practical AI solutions for managing and understanding document-based data. AI
TOOL · OpenAI News English(EN) · 45mo

DALL·E now available without waitlist

OpenAI has removed the waitlist for its DALL·E beta, allowing immediate access for all users. The image generation model is currently used by over 1.5 million people, who create more than 2 million images daily. This expansion follows iterative improvements to safety systems and the introduction of features like Outpainting, with an API planned for broader developer access soon. AI
TOOL · Hugging Face Blog English(EN) · 45mo

Image Classification with AutoTrain

Hugging Face has introduced AutoTrain, a new feature designed to simplify the process of training image classification models. This tool automates many of the complex steps involved in model development, making it more accessible. AutoTrain aims to empower users to build and deploy their own image recognition systems with greater ease. AI
TOOL · Hugging Face Blog English(EN) · 45mo

How 🤗 Accelerate runs very large models thanks to PyTorch

Hugging Face's Accelerate library can load and run models that do not fit in a single GPU's memory by building on PyTorch features such as the meta device. The model is first instantiated without allocating its weights, a device map then assigns each layer to a GPU, to CPU RAM, or to disk, and checkpoint shards are loaded directly onto their assigned devices. Layers kept off the GPU are moved in only when needed during the forward pass, which trades speed for the ability to run inference on otherwise oversized models. AI
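As a rough picture of what spreading a model across devices involves, here is a minimal sketch of greedy layer placement: layers are assigned in order, and placement moves on to the next device once the current one is full. This is an illustration only, not Accelerate's actual algorithm. The function name, layer sizes, and device capacities are hypothetical, and at least one device is assumed.

```python
def assign_layers(layer_sizes_gb: dict, device_capacity_gb: dict) -> dict:
    """Place layers in order, moving to the next device when the current one is full."""
    devices = list(device_capacity_gb.items())
    placement = {}
    idx = 0
    free = devices[0][1]  # remaining capacity on the current device
    for layer, size in layer_sizes_gb.items():
        # Skip ahead until a device with enough free room is found.
        while size > free:
            idx += 1
            if idx >= len(devices):
                raise MemoryError(f"no room left for {layer}")
            free = devices[idx][1]
        placement[layer] = devices[idx][0]
        free -= size
    return placement


# Hypothetical model and hardware, sizes in GB.
layers = {"embed": 2, "block0": 6, "block1": 6, "block2": 6, "head": 2}
devices = {"gpu0": 10, "gpu1": 10, "cpu": 32}
print(assign_layers(layers, devices))
```

Keeping layers in order means each device holds a contiguous slice of the model, so activations cross a device boundary only once per slice during a forward pass.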
RESEARCH · OpenAI News English(EN) · 45mo

Introducing Whisper

OpenAI has released Whisper, an automatic speech recognition system trained on a massive 680,000 hours of diverse, multilingual data. This extensive training enables Whisper to perform robustly across various accents, background noises, and technical language, while also supporting transcription and translation into English. The system utilizes a Transformer-based encoder-decoder architecture and is being open-sourced to foster application development and further research in speech processing. AI
RESEARCH · Hugging Face Blog English(EN) · 45mo · [2 sources]

Optimization story: Bloom inference

Hugging Face has released new optimization techniques for the BLOOM language model, significantly improving its inference speed. These advancements leverage DeepSpeed and Hugging Face's Accelerate library, enabling faster and more efficient deployment of BLOOM. The optimizations are detailed in recent blog posts, offering practical guidance for developers working with large language models. AI
RESEARCH · Hugging Face Blog English(EN) · 45mo

What's new in Diffusers? 🎨

Hugging Face has published an overview of what is new in the 0.3 release of its Diffusers library. Highlights include pipelines for image-to-image generation and inpainting, textual inversion for teaching a model new concepts from a few images, and memory optimizations that let Stable Diffusion run on smaller GPUs. The release also adds support for Apple Silicon and an ONNX export path, alongside expanded documentation. AI
RESEARCH · Hugging Face Blog English(EN) · 45mo

Train your first Decision Transformer

Hugging Face has released a guide on how to train Decision Transformers, a type of model that frames reinforcement learning as a sequence modeling problem. The blog post details the process of training these transformers, which can be used for various decision-making tasks. It aims to make this advanced technique more accessible to developers. AI
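Framing reinforcement learning as sequence modeling comes down to turning each trajectory into a token sequence that a transformer can model. The sketch below builds such a sequence by interleaving return-to-go, state, and action at every step. The function names and the toy trajectory are hypothetical and are not taken from the guide.

```python
def returns_to_go(rewards):
    """Sum of rewards from each timestep to the end of the episode."""
    out = []
    running = 0.0
    for r in reversed(rewards):
        running += r
        out.append(running)
    out.reverse()
    return out


def to_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) per timestep."""
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq.extend([("return", g), ("state", s), ("action", a)])
    return seq


# Toy three-step trajectory.
states = ["s0", "s1", "s2"]
actions = ["left", "right", "left"]
rewards = [1.0, 0.0, 2.0]

print(returns_to_go(rewards))
print(to_sequence(states, actions, rewards)[:3])
```

At inference time the first return-to-go is set to the desired total reward, and the model predicts the action that follows each (return, state) pair.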
TOOL · Replit blog English(EN) · 45mo

Ghostwriter AI & Complete Code Beta

Replit has launched Ghostwriter, an AI-powered coding assistant integrated directly into its IDE. Ghostwriter offers features like real-time code completion, code generation, transformation, and explanation, aiming to significantly enhance developer productivity. The platform emphasizes its speed, claiming to be at least twice as fast as GitHub Copilot and achieving median response times under 400ms through optimizations like FasterTransformer and knowledge distillation from open-source models. AI

IMPACT Accelerates developer workflows and potentially lowers the barrier to entry for coding.
TOOL · OpenAI News English(EN) · 46mo

DALL·E: Introducing outpainting

OpenAI has launched a new feature for its DALL·E image generation system called Outpainting. This tool allows users to extend existing images beyond their original boundaries, maintaining the original style and context. Outpainting can create images of any size and aspect ratio by interpreting natural language descriptions. The feature is now available to all DALL·E users on desktop. AI
SIGNIFICANT · OpenAI News English(EN) · 46mo · [3736 sources]

Our approach to alignment research

OpenAI has announced a partnership with Apple to integrate ChatGPT into iOS, iPadOS, and macOS, enhancing Siri and system-wide writing tools with GPT-4o capabilities. Google DeepMind has published research on scaling AI agent systems, identifying that multi-agent coordination improves parallelizable tasks but can degrade sequential ones, and has developed a predictive model for optimal agent architectures. Additionally, OpenAI has released resources on prompting fundamentals and shared insights from Netomi on scaling agentic systems in enterprise environments, highlighting the use of GPT-4.1 and GPT-5.2 for complex workflows. AI

IMPACT Partnership integrates advanced AI into consumer devices, while research offers principles for scaling complex AI agent systems.
- Sundar Pichai
- OpenAI
- Mythos Preview
- Anthropic
- CodeMender
- Google
- Koray Kavukcuoglu
- Apple
- ChatGPT
- Siri
- GPT-4o
- Google DeepMind
- AI agent systems
- GPT-4.1
- GPT-5.2
- Netomi
RESEARCH · Hugging Face Blog English(EN) · 46mo · [6 sources]

Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines

Stability AI's Stable Diffusion 3.5 Large, an updated text-to-image generation model, is now supported in Hugging Face's Diffusers library. This support is part of a broader effort to bring modularity and efficiency to diffusion models through Diffusers. The library now offers composable building blocks for diffusion pipelines, memory-efficient training with technologies like Quanto, and streamlined workflows for techniques such as Dreambooth. AI
RESEARCH · Hugging Face Blog English(EN) · 46mo

A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes

Hugging Face has integrated the bitsandbytes library to enable efficient 8-bit matrix multiplication for large transformer models. This optimization significantly reduces memory usage, allowing for the training and inference of bigger models on existing hardware. The integration aims to make advanced AI model development more accessible by lowering computational barriers. AI
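The memory saving comes from storing values as 8-bit integers plus a scale factor. The following is a minimal sketch of absmax quantization, one of the simplest such schemes. It illustrates the idea only and is not the bitsandbytes implementation. The helper names are hypothetical, and the input is assumed to contain at least one non-zero value.

```python
def quantize_absmax(values):
    """Map floats to int8 range [-127, 127] using the largest absolute value as reference."""
    scale = 127.0 / max(abs(v) for v in values)  # assumes a non-zero maximum
    return [round(v * scale) for v in values], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the int8 values."""
    return [q / scale for q in quantized]


vector = [1.2, -0.5, -4.3, 1.2, -3.1, 0.8, 2.4, 5.4]
q, scale = quantize_absmax(vector)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(vector, restored))

print(q)
print(f"max round-trip error: {max_error:.4f}")
```

Each value now needs one byte instead of two or four, at the cost of a small rounding error bounded by half a quantization step.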
RESEARCH · Practical AI English(EN) · 46mo

CMU's AI pilot lands in the news 🗞

Carnegie Mellon University has developed an AI pilot capable of navigating complex and crowded airspace. This advancement was highlighted in a recent discussion covering various AI topics, including infrastructure tools like Baseten's Truss and advancements in transformer models. The AI's ability to manage aerial traffic was a notable point of interest. AI
RESEARCH · OpenAI News English(EN) · 46mo · [2 sources]

Upgrading the Moderation API with our new multimodal moderation model

OpenAI has released an upgraded Moderation API, powered by a new multimodal model based on GPT-4o. This enhanced model offers improved accuracy in detecting harmful text and images, particularly in non-English languages, and supports new categories like illicit activities. The update aims to provide developers with more robust tools for content safety, enabling them to build more secure AI applications and products. AI
RESEARCH · Hugging Face Blog English(EN) · 47mo

Nyströmformer: Approximating self-attention in linear time and memory via the Nyström method

Researchers have developed Nyströmformer, a novel approach to approximating self-attention mechanisms in transformer models. This method utilizes the Nyström method to achieve linear time and memory complexity, a significant improvement over the quadratic complexity of standard self-attention. The innovation holds promise for enabling transformers to handle much longer sequences more efficiently. AI
FRONTIER RELEASE · Hugging Face Blog Português(PT) · 47mo · [3 sources]

A Complete Guide to Audio Datasets

OpenAI has released new, advanced audio models through its API, enhancing capabilities for voice agents. The updated speech-to-text models, including gpt-4o-transcribe and gpt-4o-mini-transcribe, offer improved accuracy and reliability, particularly in challenging audio conditions. Additionally, a new text-to-speech model, gpt-4o-mini-tts, allows developers to customize vocal delivery for more expressive and tailored applications. AI
RESEARCH · Hugging Face Blog English(EN) · 47mo

Faster Text Generation with TensorFlow and XLA

Hugging Face has integrated TensorFlow and XLA to significantly accelerate text generation. This optimization allows for faster inference speeds, making it more efficient to deploy large language models. The improvements are particularly noticeable for users leveraging TensorFlow within the Hugging Face ecosystem. AI
RESEARCH · OpenAI News English(EN) · 47mo

A hazard analysis framework for code synthesis large language models

OpenAI has developed a hazard analysis framework to identify potential risks associated with large language models that generate code, such as their model Codex. This framework aims to uncover technical, social, political, and economic safety concerns that may arise from the deployment of these powerful code-synthesis tools. The analysis is supported by a new evaluation system that assesses the models' ability to understand and execute complex prompts compared to human capabilities. AI
TOOL · OpenAI News English(EN) · 47mo · [2 sources]

DALL·E API now available in public beta

OpenAI has launched a public beta for its DALL·E API, allowing developers to integrate the image generation technology into their applications. This move follows the beta release of DALL·E itself, which is inviting one million users from its waitlist. The API aims to provide developers with the same capabilities used by over 3 million people, enabling the creation of diverse images from natural language descriptions. OpenAI has incorporated safety measures learned from its broader DALL·E deployment, including filters for harmful content and bias reduction techniques, to support responsible integration. AI
RESEARCH · Practical AI English(EN) · 47mo

DALL-E is one giant leap for raccoons! 🔭

OpenAI has released DALL-E 2, a new model capable of generating detailed images from text descriptions. While some in the AI community speculate about models approaching sentience, the hosts of this podcast dismiss such notions. They highlight DALL-E 2's impressive capabilities, particularly its ability to create imaginative visuals like raccoons in space. AI
RESEARCH · OpenAI News English(EN) · 47mo

Reducing bias and improving safety in DALL·E 2

OpenAI has implemented a new system-level technique for DALL·E 2 to generate more diverse images of people when race or gender are not specified in prompts. This change, informed by user feedback during a research preview, has resulted in users being 12 times more likely to see diverse representations. Additionally, OpenAI has enhanced safety measures by rejecting realistic face uploads, limiting public figure likeness generation, and refining content filters and monitoring systems to prevent misuse and deceptive content. AI
RESEARCH · Hugging Face Blog English(EN) · 47mo

How to train your model dynamically using adversarial data

Hugging Face has released a guide on dynamically training models using adversarial data. This method involves generating adversarial examples during the training process to improve model robustness. The guide uses the MNIST dataset as a practical example to demonstrate the techniques involved. AI
RESEARCH · Hugging Face Blog English(EN) · 47mo

The Technology Behind BLOOM Training

BLOOM, an open-access large language model, was trained using a combination of Megatron-LM and DeepSpeed. This approach allowed for efficient training across multiple GPUs by distributing the model and data. The training process involved careful management of hardware resources and software configurations to achieve optimal performance. AI