Brief

last 24h

[50/8400] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

RESEARCH · Hugging Face Blog English(EN) · 30mo · [31 sources]

MachinaCheck: Building a Multi-Agent CNC Manufacturability System on AMD MI300X

A new system called MachinaCheck has been developed to automate the manufacturability assessment of CNC parts, reducing the process from an hour to 30 seconds. This multi-agent AI system leverages the Qwen 2.5 7B Instruct model running on AMD MI300X hardware to ensure that sensitive customer design data remains on-premise, addressing critical privacy concerns in manufacturing. The system parses STEP files to extract geometric features and then uses the LLM to determine necessary CNC operations and tools, providing a comprehensive report. AI

IMPACT Enables on-premise AI for sensitive manufacturing data, potentially accelerating adoption of AI in industries with strict IP requirements.
- LangChain
- MachinaCheck
- Qwen 2.5 7B Instruct
- AMD MI300X
- AMD
- Anthropic
- OpenAI
- FastAPI
- Hugging Face
RESEARCH · Smol AINews English(EN) · 30mo

12/20/2023: Project Obsidian - Multimodal Mistral 7B from Nous

Nous Research has released Project Obsidian, a multimodal version of the Mistral 7B language model. This new model is capable of processing and generating both text and images. The release aims to provide a more versatile and accessible tool for multimodal AI development. AI
RESEARCH · Hugging Face Blog English(EN) · 30mo · [2 sources]

Blazingly fast whisper transcriptions with Inference Endpoints

Hugging Face has released updates to accelerate Whisper, OpenAI's open-source speech-to-text model. By leveraging speculative decoding, they report up to a 2x speedup in inference. These performance gains are being made available through Hugging Face's Inference Endpoints service, allowing developers to deploy faster transcription capabilities. AI
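As a rough illustration of the speculative decoding idea mentioned in this item, here is a minimal greedy sketch. It is not Hugging Face's implementation: `target` and `draft` are hypothetical stand-in functions that map a token list to the next token, and a real system would verify all drafted positions in one batched forward pass rather than a loop.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # hypothetical model interface


def greedy_decode(model: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Plain greedy decoding: one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new: int, k: int = 4) -> List[Token]:
    """Greedy speculative decoding; the result matches greedy_decode(target, ...)."""
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(min(k, limit - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target model checks the proposal position by position.
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            accepted.append(expected)  # always keep the target's own choice
            if tok != expected:
                break  # first mismatch: discard the rest of the draft
        out.extend(accepted)
    return out


# Toy models (assumptions for illustration): the draft agrees with the
# target except when the previous token is 5.
def target(ctx: List[Token]) -> Token:
    return (ctx[-1] * 3 + 1) % 7


def draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 5 else target(ctx)
```

The speedup in practice comes from the draft model being much cheaper than the target and from verifying several drafted tokens per target pass; the sketch only demonstrates that the accepted output is unchanged.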
RESEARCH · Smol AINews English(EN) · 30mo

12/18/2023: Gaslighting Mistral for fun and profit

Mistral AI has released its latest open-source model, Mixtral 8x7B. This model utilizes a sparse mixture-of-experts (SMoE) architecture, which allows it to achieve performance comparable to larger dense models while using significantly fewer computational resources during inference. Mixtral 8x7B has demonstrated strong performance on various benchmarks, outperforming other open-source models and even rivaling some proprietary models like GPT-3.5. AI
TOOL · OpenAI News English(EN) · 30mo

Increasing accuracy of pediatric visit notes

Summer Health has partnered with OpenAI to use GPT-4 for generating pediatric medical visit notes, improving efficiency and parent satisfaction. The AI-powered solution cuts the time pediatricians spend on each note fivefold, from ten minutes to two, and is reported to reduce note completion delays by 400%. Parents have reported receiving clearer, more understandable notes, leading to better-informed health decisions. AI
RESEARCH · Smol AINews Deutsch(DE) · 30mo

12/13/2023 SOLAR10.7B upstages Mistral7B?

The SOLAR-10.7B model has been released, demonstrating performance that rivals or surpasses that of Mistral-7B on various benchmarks. This open-source model was developed by Upstage, and its release is expected to provide a strong alternative for developers and researchers in the AI community. The model's architecture and training methodology are detailed in accompanying research. AI
TOOL · Smol AINews English(EN) · 30mo

12/12/2023: Towards LangChain 0.1

LangChain is nearing its 0.1 release, indicating a significant milestone for the popular framework used in developing applications powered by large language models. This upcoming release suggests a move towards greater stability and feature completeness, essential for production environments. The development signifies the maturation of tools supporting the burgeoning AI application ecosystem. AI
RESEARCH · Smol AINews English(EN) · 30mo · [2 sources]

12/15/2023: Mixtral-Instruct beats Gemini Pro (and matches GPT3.5)

Mistral AI's Mixtral model has demonstrated strong performance, surpassing Google's Gemini Pro and matching OpenAI's GPT-3.5 on certain benchmarks. Earlier reports indicated that Mixtral also outperformed GPT-3.5 and Meta's Llama 2 70B model. These results highlight the growing capabilities of open-source models in competing with leading proprietary AI systems. AI
RESEARCH · Hugging Face Blog English(EN) · 30mo

Mixture of Experts Explained

Hugging Face has published a detailed explanation of Mixture of Experts (MoE) models, a technique that allows for more efficient scaling of large language models. MoE architectures activate only specific parts of the neural network for each input, leading to faster inference and reduced computational costs compared to dense models of similar size. This approach is becoming increasingly popular for training state-of-the-art models. AI
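To make the "activate only specific parts of the network" point concrete, here is a minimal sketch of top-k expert routing in plain Python. The function names, the router layout, and the toy experts are assumptions for illustration, not code from the Hugging Face post.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Vector], Vector]


def softmax(xs: Sequence[float]) -> List[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x: Vector, router_weights: Sequence[Vector],
                experts: Sequence[Expert], top_k: int = 2) -> Tuple[Vector, List[int]]:
    """Route x to the top_k highest-scoring experts and mix their outputs."""
    # Router: one logit per expert (dot product of x with that expert's row).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Only the top_k experts are evaluated; the others cost no compute.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    gates = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


# Toy setup (hypothetical): four experts that just scale their input.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, used = moe_forward([2.0, 1.0], router_weights, experts, top_k=2)
```

With four experts and `top_k=2`, only half of the expert parameters are touched for this input, which is the source of the inference savings the summary describes.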
RESEARCH · Smol AINews English(EN) · 30mo

12/9/2023: The Mixtral Rush

Mistral AI has released Mixtral 8x7B, a sparse mixture-of-experts (SMoE) large language model. This model demonstrates strong performance, outperforming Llama 2 70B on many benchmarks while using significantly less compute during inference. The model is available under the Apache 2.0 license, allowing for commercial use. AI
RESEARCH · Smol AINews (CA) · 30mo

12/8/2023 - Mamba vs Mistral vs Hyena

The Mamba model has emerged as a strong contender against established architectures like Mistral and Hyena, particularly in its ability to handle long sequences efficiently. This new architecture utilizes a selective state space model, which allows for faster inference and training compared to traditional transformers. Its performance suggests a potential shift in how large language models are designed and optimized for speed and scalability. AI
RESEARCH · Latent Space Podcast English(EN) · 30mo

The Busy Person's Intro to Finetuning & Open Source AI - Wing Lian, Axolotl

Wing Lian, the maintainer of the Axolotl library, discussed the growing ecosystem of fine-tuned open-source AI models. Axolotl has become a popular tool for customizing models like Llama 2 and Mistral 7B, enabling benefits such as enhanced privacy, specific performance improvements, and reduced inference costs. The library supports various fine-tuning techniques and prompt formats, catering to a wide range of model architectures and communities. AI
COMMENTARY · Smol AINews English(EN) · 30mo

Is Google's Gemini... legit?

The article questions the legitimacy and capabilities of Google's Gemini AI model, suggesting it may not be as advanced as claimed. It points to potential issues and limitations that raise doubts about its performance and readiness. The piece implies that the public perception of Gemini might be inflated compared to its actual functionality. AI
RESEARCH · Hugging Face Blog English(EN) · 30mo

SetFitABSA: Few-Shot Aspect Based Sentiment Analysis using SetFit

Hugging Face has released SetFitABSA, a new framework for few-shot Aspect-Based Sentiment Analysis (ABSA). This approach leverages the SetFit model to achieve strong performance with minimal labeled data. The framework is designed to be efficient and adaptable for various ABSA tasks. AI
RESEARCH · Hugging Face Blog English(EN) · 30mo

Goodbye cold boot - how we made LoRA Inference 300% faster

Hugging Face has developed a new method to significantly speed up LoRA (Low-Rank Adaptation) inference, achieving a 300% performance increase. This optimization addresses the issue of slow cold boot times previously associated with dynamic loading of LoRA adapters. The new technique allows for faster loading and utilization of these adapters, improving the efficiency of fine-tuned models. AI
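For context on why adapters can be loaded quickly, the sketch below shows the basic LoRA arithmetic: a shared base weight stays in memory while small low-rank matrices are swapped per request. The class and method names are hypothetical and this is not the serving code described in the post.

```python
from typing import Dict, List, Optional, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoRALinear:
    """A linear layer with a frozen base weight and swappable low-rank adapters."""

    def __init__(self, weight: Matrix) -> None:
        self.weight = weight  # shape: d_out x d_in, loaded once and kept warm
        self.adapters: Dict[str, Tuple[Matrix, Matrix, float]] = {}

    def add_adapter(self, name: str, a: Matrix, b: Matrix, alpha: float) -> None:
        # a has shape r x d_in, b has shape d_out x r; r is the adapter rank.
        rank = len(a)
        self.adapters[name] = (a, b, alpha / rank)

    def forward(self, x: Vector, adapter: Optional[str] = None) -> Vector:
        y = matvec(self.weight, x)
        if adapter is None:
            return y
        a, b, scale = self.adapters[adapter]
        delta = matvec(b, matvec(a, x))  # B (A x): only r * (d_in + d_out) extra weights
        return [yi + scale * di for yi, di in zip(y, delta)]


layer = LoRALinear([[1.0, 0.0], [0.0, 1.0]])
layer.add_adapter("style-a", a=[[1.0, 1.0]], b=[[1.0], [0.0]], alpha=2.0)
base_out = layer.forward([1.0, 2.0])
adapted_out = layer.forward([1.0, 2.0], adapter="style-a")
```

Because each adapter holds far fewer numbers than the base weight, switching adapters avoids reloading the full model, which is the general property that fast adapter swapping relies on.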
RESEARCH · Hugging Face Blog Nederlands(NL) · 30mo

Open LLM Leaderboard: DROP deep dive

Hugging Face has updated its Open LLM Leaderboard to incorporate a new benchmark called DROP (Discrete Reasoning Over Paragraphs). This addition aims to better assess the reasoning capabilities of large language models, particularly in tasks requiring multi-hop reasoning and understanding of complex textual information. DROP is now a key component in ranking open-source models, providing a more nuanced view of their performance beyond traditional benchmarks. AI
TOOL · Replit blog English(EN) · 30mo

Replit + Weights & Biases: Building a RAG Bot

Weights & Biases has developed an AI-powered assistant called WandBot to help users navigate its documentation and code examples. This retrieval-augmented generation (RAG) bot utilizes OpenAI's GPT-4 for its intelligence, combined with Cohere embeddings and a FAISS vector store for efficient information retrieval. WandBot is integrated with platforms like Discord, Slack, and ChatGPT, and is hosted on Replit for seamless deployment and scalability. AI

IMPACT Enhances developer productivity by providing instant, context-aware support for AI tools and documentation.
- Weights & Biases
- WandBot
- OpenAI
- GPT-4
- Replit
- Cohere
- llama-index
- FAISS
- Discord
- Slack
- Zendesk
- ChatGPT
- Bharat Ramanathan
- Morgan McGuire
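The retrieve-then-generate pattern behind a bot like WandBot can be sketched as follows. This is a toy stand-in under loud assumptions: bag-of-words counts replace the Cohere embeddings, a sorted list replaces the FAISS index, the documents are invented, and the final LLM call is omitted.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Assemble the prompt that would be sent to the language model."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical documentation snippets, invented for this example.
docs = [
    "to start a run call the init function",
    "sweeps automate hyperparameter search",
    "artifacts version datasets and models",
]
top = retrieve("how do i start a run", docs, k=1)
prompt = build_prompt("how do i start a run", docs, k=1)
```

Swapping `embed` for a learned embedding model and the sort for a vector index changes the quality and speed of retrieval, but not the overall shape of the pipeline.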
TOOL · Practical AI English(EN) · 31mo

Generating product imagery at Shopify

Shopify has developed an AI tool capable of generating product imagery, specifically by replacing background scenes. This innovation was showcased on a Hugging Face space, demonstrating its effectiveness. The development process focused on creating clever AI solutions without the need for extensive model training. AI
TOOL · Replit blog English(EN) · 31mo

Announcing Replit Core - The Essential Membership for Builders

Replit has launched Replit Core, a new membership plan designed to offer an integrated developer experience. The plan includes advanced AI coding assistance powered by GPT-4, an upgraded cloud development environment with enhanced compute resources and security features, and one-click deployments with on-demand scaling. Additionally, Replit Core provides priority support, access to community events, and partner perks such as a Perplexity Pro subscription and Neon PostgreSQL integration. AI

IMPACT Enhances developer productivity with integrated AI coding assistance and provides robust cloud infrastructure for building and deploying applications.
- Google Cloud
- Neon
- Replit Core
- GPT-4
- Perplexity
- Replit
RESEARCH · EleutherAI Blog English(EN) · 31mo

Extending the RoPE

EleutherAI has published a blog post detailing methods to extend the context length of models that use Rotary Position Embeddings (RoPE), a technique crucial for modern language models. The post explains how RoPE makes attention scores depend on the relative distance between tokens. It introduces Position Interpolation (PI) as an efficient fine-tuning method that adapts pre-trained models to longer sequences by scaling down position indices. AI
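The two ideas in this summary can be shown in a few lines. The sketch below applies a standard RoPE rotation to pairs of vector components and then rescales positions in the way Position Interpolation describes; the function names and the default `base` of 10000 are assumptions for illustration rather than code from the EleutherAI post.

```python
import math
from typing import List


def apply_rope(vec: List[float], position: float, base: float = 10000.0) -> List[float]:
    """Rotate each consecutive pair of components by a position-dependent angle."""
    dim = len(vec)
    out: List[float] = []
    for i in range(dim // 2):
        theta = position * base ** (-2 * i / dim)
        x, y = vec[2 * i], vec[2 * i + 1]
        out += [x * math.cos(theta) - y * math.sin(theta),
                x * math.sin(theta) + y * math.cos(theta)]
    return out


def interpolated_position(position: int, train_len: int, target_len: int) -> float:
    """Position Interpolation: squeeze positions back into the trained range."""
    scale = min(1.0, train_len / target_len)
    return position * scale


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]
# Same relative distance (2) at different absolute positions gives the same score.
score_near = dot(apply_rope(q, 5), apply_rope(k, 3))
score_far = dot(apply_rope(q, 12), apply_rope(k, 10))
# A model trained on 2048 positions, extended to 4096 via interpolation.
last_pos = interpolated_position(4095, train_len=2048, target_len=4096)
```

Here position 4095 is mapped to 2047.5, so every rotation angle stays inside the range seen during pre-training, which is why a short fine-tune is reported to be enough to adapt the model.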
RESEARCH · Hugging Face Blog English(EN) · 31mo

SDXL in 4 steps with Latent Consistency LoRAs

Hugging Face has released a new technique called Latent Consistency LoRAs (LC-LoRAs) that significantly speeds up the image generation process for Stable Diffusion XL. This method allows users to generate high-quality images in as few as four steps, a dramatic reduction from the typical 20-50 steps. The LC-LoRAs are designed to be compatible with existing Stable Diffusion XL models and can be easily integrated into workflows, offering a substantial performance boost for creators. AI
TOOL · Hugging Face Blog English(EN) · 31mo

Make your llama generation time fly with AWS Inferentia2

Hugging Face has partnered with AWS to optimize Llama 2 model inference on AWS Inferentia2 chips. This collaboration enables significantly faster generation times for Llama 2 models, making them more efficient for deployment. The integration leverages AWS's specialized hardware to reduce latency and improve throughput for large language model applications. AI
RESEARCH · Hugging Face Blog English(EN) · 31mo

Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama 2, and Mistral for Disaster Tweets Analysis with Lora

Researchers explored the effectiveness of LoRA (Low-Rank Adaptation) in fine-tuning large language models for disaster tweet analysis. The study compared the performance of RoBERTa, Llama 2, and Mistral when adapted using LoRA. Results indicated that LoRA significantly improved the efficiency and performance of these models in classifying disaster-related tweets. AI
FRONTIER RELEASE · OpenAI News English(EN) · 31mo

New models and developer products announced at DevDay

OpenAI announced several updates at its DevDay event, including the new GPT-4 Turbo model with a 128K context window and knowledge up to April 2023, offered at a reduced price. The company also introduced an Assistants API to simplify the creation of AI-powered applications and enhanced multimodal capabilities with DALL-E 3 and vision support. These updates aim to provide developers with more powerful and cost-effective tools, with new features rolling out starting today. AI
- OpenAI
- GPT-3.5 Turbo
- ChatGPT
- DALL-E 3
- GPT-4
- Assistants API
- GPT-4 Turbo
- DevDay
RESEARCH · Latent Space Podcast English(EN) · 31mo

Beating GPT-4 with Open Source LLMs — with Michael Royzen of Phind

Phind has released a new open-source model that now ranks as the top model on the BigCode Leaderboard, surpassing GPT-4 in performance on certain benchmarks. This model, based on CodeLlama-34B and further fine-tuned on extensive code and reasoning data, boasts a significantly expanded context window and is notably faster than GPT-4. Phind's approach emphasizes both the quality of retrieved context and the accuracy of the generated code, aiming to provide developers with a comprehensive tool for technical questions and implementation. AI
TOOL · Practical AI English(EN) · 31mo

Self-hosting & scaling models

This podcast episode features Tuhin Srivastava from Baseten discussing the self-hosting and scaling of open-access AI models. The conversation delves into current trends in tooling and usage for these models, as well as common applications. The growth of generative AI and its impact on the ecosystem of self-hosted models was also a key topic. AI
TOOL · Hugging Face Blog English(EN) · 32mo

Personal Copilot: Train Your Own Coding Assistant

Hugging Face has released a guide on how to train a personalized coding assistant. This allows developers to create an AI model tailored to their specific coding style and project needs. The process involves fine-tuning existing large language models with personal code data. AI
SIGNIFICANT · OpenAI News English(EN) · 32mo

Frontier risk and preparedness

OpenAI has established a new Preparedness team, led by Aleksander Madry, to focus on the safety risks associated with highly capable AI systems, including potential catastrophic misuse. This team will integrate capability assessment, evaluations, and red teaming for future frontier models and AGI. OpenAI is also launching an AI Preparedness Challenge to identify novel catastrophic misuse risks, offering API credits to top submissions and seeking talent from participants. AI
RESEARCH · Lil'Log (Lilian Weng) English(EN) · 32mo · [3 sources]

Adversarial Attacks on LLMs

Researchers are developing new methods to enhance the safety and robustness of large language models against adversarial attacks. These attacks, often in the form of carefully crafted prompts, aim to bypass built-in safety mechanisms and elicit undesirable outputs. Efforts include creating guardrails like AprielGuard and developing leaderboards to track and improve model security against such vulnerabilities. AI
RESEARCH · Hugging Face Blog English(EN) · 32mo

The N Implementation Details of RLHF with PPO

This blog post delves into the technical intricacies of implementing Reinforcement Learning from Human Feedback (RLHF) using the Proximal Policy Optimization (PPO) algorithm. It provides a deep dive into the practical aspects and challenges encountered when applying PPO for fine-tuning language models. The content aims to offer developers a comprehensive guide to successfully integrating RLHF into their model training pipelines. AI
COMMENTARY · Latent Space Podcast English(EN) · 32mo

The End of Finetuning — with Jeremy Howard of Fast.ai

Jeremy Howard of Fast.ai, a prominent voice in machine learning, discussed the evolution of fine-tuning techniques in a recent podcast. He highlighted how his 2018 ULMFiT paper, which demonstrated the effectiveness of fine-tuning pre-trained language models, was initially met with skepticism. Despite the current widespread adoption of fine-tuning, Howard suggests that the approach may be flawed due to issues like catastrophic forgetting and memorization. AI
TOOL · OpenAI News English(EN) · 32mo

DALL·E 3 is now available in ChatGPT Plus and Enterprise

OpenAI has integrated its DALL·E 3 image generation model into ChatGPT Plus and Enterprise subscriptions. This allows users to create and refine unique images directly within a conversational interface, leveraging detailed prompts for more accurate and visually striking results. The model demonstrates improved capabilities in rendering intricate details like text and hands, and OpenAI has implemented a multi-tiered safety system to prevent the generation of harmful content. AI
- Enterprise
- OpenAI
- DALL·E 3
- ChatGPT Plus
- ChatGPT
RESEARCH · Hugging Face Blog English(EN) · 32mo · [220 sources]

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

Recent research explores novel methods to enhance the reasoning capabilities and efficiency of large language models (LLMs). Papers introduce techniques like speculative exploration for Tree-of-Thought reasoning to break synchronization bottlenecks and achieve significant speedups. Other work focuses on improving tool-integrated reasoning by pruning erroneous tool calls at inference time and developing frameworks for robots to perform physical reasoning in latent spaces before acting. Additionally, research investigates the effectiveness of different reasoning protocols, such as debate and voting, for LLMs, finding that while some methods improve safety, they don't always enhance usefulness. AI

IMPACT New methods for efficient reasoning and tool integration could enhance LLM performance and applicability in complex tasks.
- RunAgent
- Token Arena
- CoSMo
- Qwen 2.5
- Llama 3
- arXiv
- MENTAT
- Hugging Face
- LLM
- Llama 3.1 8B Instruct
- QbitAI
- Tree-of-Thought
- SPEX
- PruneTIR
- LaST-R1
- Meta
- Mistral 3 8B Instruct
TOOL · OpenAI News English(EN) · 32mo

Simplifying contract reviews with AI

Ironclad has launched AI Assist™, a new feature for its contract lifecycle management platform that leverages OpenAI's GPT-4 technology. This tool automates the review of legal contracts, identifying and redlining irregularities significantly faster than manual processes. AI Assist™ also offers pre-approved clauses and supports text prompting, aiming to enhance legal team efficiency without replacing human professionals. The feature has seen rapid adoption and positive customer feedback, demonstrating AI's transformative potential in the legal sector. AI
TOOL · OpenAI News English(EN) · 32mo

Evolving online forms into dynamic data

Typeform has launched Formless, a new AI-powered platform that transforms traditional online forms into dynamic, conversational data collection experiences. Built using OpenAI's GPT-3.5 Turbo and GPT-4 models, Formless allows users to provide instructions instead of designing a form structure, with the AI generating conversational questions based on responses. The platform offers features like AI-driven analysis, personalized brand tone, multilingual support, and the ability to query collected data through natural language, aiming to make data gathering more intuitive and insightful. AI
TOOL · Replit blog English(EN) · 32mo

Replit’s new AI Model now available on Hugging Face

Replit has released its new code generation language model, Replit Code V1.5 3B, on Hugging Face. This model is trained on a massive dataset of permissively licensed code and publicly available developer content, aiming to provide high-quality code completion. Replit is making this model freely available to its community of over 25 million developers, encouraging its use as a foundational model for further fine-tuning and application development. AI

IMPACT Provides developers with a powerful, freely available code generation model that can be fine-tuned for specific applications.
RESEARCH · OpenAI News English(EN) · 32mo

DALL·E 3 system card

OpenAI has released a system card for DALL·E 3, detailing its capabilities and the steps taken to prepare it for deployment. The new image generation model improves upon DALL·E 2 by offering enhanced caption fidelity and overall image quality. OpenAI's system card outlines their efforts in red teaming, risk evaluation, and the implementation of mitigations to reduce unwanted behaviors and potential risks associated with the model. AI
RESEARCH · Hugging Face Blog English(EN) · 32mo

Non-engineers guide: Train a LLaMA 2 chatbot

Hugging Face has released a guide aimed at non-engineers to train a LLaMA 2 chatbot. The guide provides a step-by-step process, making it accessible for individuals without extensive technical backgrounds. It covers the essential aspects of chatbot training using the LLaMA 2 model, enabling a broader audience to engage with AI development. AI
RESEARCH · Hugging Face Blog Bahasa(ID) · 33mo

Llama 2 on Amazon SageMaker a Benchmark

Meta's Llama 2 model is now available on Amazon SageMaker, offering a new benchmark for performance on the cloud platform. This integration allows developers to leverage Llama 2's capabilities within the SageMaker environment, potentially streamlining AI development and deployment workflows. The benchmark results highlight the efficiency and effectiveness of running large language models on AWS infrastructure. AI
- Llama 2
- Amazon SageMaker
- AWS
- Meta
RESEARCH · OpenAI News English(EN) · 33mo

GPT-4V(ision) system card

OpenAI has released a system card detailing the safety properties of its GPT-4V model, which can analyze image inputs. This multimodal capability is seen as a significant advancement in AI research, expanding the potential applications of large language models. The system card elaborates on the evaluations, preparations, and mitigation strategies implemented to ensure the safe handling of image data within GPT-4V. AI
- GPT-4
- OpenAI
- GPT-4V
TOOL · Practical AI English(EN) · 33mo

Automate all the UIs!

Dominik Klotz of askui discussed the potential of AI to automate user interfaces across any operating system. The conversation explored how generative AI, large language models, and computer vision are being integrated to achieve this broad automation capability. This approach aims to enable automation for a wide range of use cases by understanding and interacting with UIs programmatically. AI
RESEARCH · OpenAI News English(EN) · 33mo · [2 sources]

Advancing red teaming with people and AI

OpenAI has announced new initiatives to enhance AI safety through red teaming, a process of using people and AI to identify potential risks in new systems. The company is sharing two papers detailing their approach to external red teaming and introducing a new method for automated red teaming. Additionally, OpenAI is launching a Red Teaming Network to formally recruit domain experts from diverse backgrounds to collaborate on evaluating and improving the safety of their AI models throughout the development lifecycle. AI
TOOL · Hugging Face Blog English(EN) · 33mo

Optimizing your LLM in production

Hugging Face has released a guide detailing methods for optimizing Large Language Models (LLMs) for production environments. The guide covers techniques such as quantization, pruning, and knowledge distillation to reduce model size and improve inference speed. It also discusses efficient serving strategies and hardware considerations for deploying LLMs effectively. The aim is to help developers make LLMs more practical and cost-efficient for real-world applications. AI
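As a small illustration of the quantization technique named in this summary, here is a sketch of symmetric per-tensor int8 quantization in plain Python. It is an assumption-laden toy, not the guide's code: production libraries typically quantize per channel or per block and run the arithmetic in optimized kernels.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now needs one byte instead of four (for float32), at the cost of a rounding error of at most half the scale per weight.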
RESEARCH · Hugging Face Blog Deutsch(DE) · 33mo

Fine-tuning Llama 2 70B using PyTorch FSDP

Hugging Face has released a guide detailing how to fine-tune Meta's Llama 2 70B model using PyTorch's Fully Sharded Data Parallel (FSDP) feature. This method significantly reduces memory requirements, enabling the fine-tuning process on more accessible hardware. The guide emphasizes efficient training techniques to make large language model customization more feasible for a wider range of users and researchers. AI
RESEARCH · Lil'Log (Lilian Weng) English(EN) · 33mo · [16 sources]

Diffusion Models for Video Generation

Researchers are exploring advanced diffusion models for video generation, addressing challenges like temporal consistency and data scarcity. New methods focus on improving parameterization, such as the v-prediction technique, and incorporating conditional sampling for tasks like extending video length or filling missing frames. Efforts are also underway to enhance efficiency and controllability through post-training frameworks, hybrid attention mechanisms, and semantic-visual adaptation, aiming for real-time generation and higher quality outputs. AI

IMPACT Advances in diffusion models are improving video generation quality, efficiency, and controllability, potentially enabling new applications in content creation and analysis.
RESEARCH · Hugging Face Blog English(EN) · 33mo · [2 sources]

Exploring simple optimizations for SDXL

Hugging Face has released new techniques to optimize Stable Diffusion XL (SDXL) for more efficient image generation. One method focuses on general performance improvements, while another introduces T2I-Adapters for enhanced controllable generation. These advancements aim to make SDXL more accessible and versatile for users. AI
TOOL · Latent Space Podcast English(EN) · 33mo

The Point of LangChain — with Harrison Chase of LangChain

LangChain has launched LangChain Hub, a platform for developers to discover use cases and prompts, accessible to all LangSmith users. The open-source framework, created in October 2022, has rapidly become a popular tool for building AI applications, particularly those involving Retrieval Augmented Generation (RAG). Despite facing critiques and evolving LLM capabilities from frontier labs like OpenAI, LangChain's modular design has allowed it to remain relevant by adapting to new features such as chat APIs and function calling. AI
TOOL · OpenAI News English(EN) · 33mo

Join us for OpenAI’s first developer conference on November 6 in San Francisco

OpenAI has announced its inaugural developer conference, DevDay, scheduled for November 6, 2023, in San Francisco. The event aims to bring together hundreds of developers globally to preview new tools and foster idea exchange. Attendees will have access to breakout sessions led by OpenAI's technical staff, while a keynote will be livestreamed for a wider audience. This conference highlights OpenAI's commitment to empowering developers, noting that over 2 million developers currently utilize their API for integrating advanced AI models like GPT-4 and DALL-E into various applications. AI
RESEARCH · Hugging Face Blog English(EN) · 33mo

Spread Your Wings: Falcon 180B is here

Technology Innovation Institute (TII) has released Falcon 180B, a new large language model, making it available on Hugging Face. This model boasts 180 billion parameters and is designed for research and commercial use. Falcon 180B is noted for its strong performance on various benchmarks, positioning it as a significant open-source alternative in the LLM landscape. AI
RESEARCH · Hugging Face Blog English(EN) · 33mo

AudioLDM 2, but faster ⚡️

Hugging Face has released an optimized version of AudioLDM 2, a text-to-audio generation model. This updated version significantly improves inference speed, making it more practical for real-time applications. The enhancements allow for faster generation of high-quality audio samples directly from text prompts. AI