PulseAugur / Brief

Brief

last 24h
[50/8391] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Deploy MusicGen in no time with Inference Endpoints

    A Hugging Face blog post shows how to deploy MusicGen with Inference Endpoints, a service designed to simplify the deployment of AI models. Users can quickly set up an API for models like MusicGen, enabling easy integration and usage. The goal is to make advanced AI capabilities more accessible to developers and businesses.

  2. Towards Encrypted Large Language Models with FHE

    Researchers have developed a method to run large language models using fully homomorphic encryption (FHE), allowing computations on encrypted data without decryption. This breakthrough enables privacy-preserving AI applications where sensitive user data can be processed securely. The approach integrates FHE techniques with existing LLM architectures, paving the way for confidential AI services.

  3. Open-sourcing Knowledge Distillation Code and Weights of SD-Small and SD-Tiny

    Segmind has released the code and weights for SD-Small and SD-Tiny, two smaller versions of Stable Diffusion, in a post on the Hugging Face blog. These models were created using knowledge distillation, a technique that trains a smaller model to mimic the behavior of a larger one. The goal is to make powerful image generation models more accessible and efficient for researchers and developers.
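
    As a rough illustration of the "smaller model mimics a larger one" idea, the sketch below computes a classic logit-based distillation loss: the KL divergence between temperature-softened teacher and student distributions. This is a generic textbook form, not the training code released for SD-Small and SD-Tiny, and the function names, logits, and temperature are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

    The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a student would be trained to minimize.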

  4. Building Secure AI Gateways with MLflow AI Gateway

    Google Research has introduced ReasoningBank, a novel framework designed to enhance AI agents' ability to learn from their experiences, both successes and failures, after deployment. This system distills generalizable reasoning strategies from past interactions, allowing agents to continuously improve and avoid repeating mistakes. Separately, new research explores optimizing multi-agent communication through latent representations and introduces Agent Evolving Learning (AEL) for agents operating in open-ended environments, focusing on how to effectively use remembered information. Additionally, DeepSeek has released preview models of its V4 series, offering large context windows and advanced capabilities at a significantly lower cost than comparable frontier models.

    IMPACT New frameworks for agent learning and memory, alongside cost-effective frontier models, could accelerate AI adoption in complex tasks and personalized applications.

  5. There's a new Llama in town

    Meta AI has released Llama 2, a new large language model that is expected to significantly impact the LLM landscape. The hosts also cover Zip-NeRF, a separate NeRF model capable of generating 3D scenes from 2D images, and discuss new functionality from OpenAI alongside Anthropic's Claude 2.

  6. Happy 1st anniversary 🤗 Diffusers!

    The Hugging Face Diffusers library celebrated its first anniversary, marking a significant milestone in the open-source AI community. Since its launch, Diffusers has become a pivotal tool for researchers and developers working with diffusion models, enabling easier experimentation and deployment of generative AI applications. The library's success highlights the growing importance of accessible and collaborative platforms for advancing AI research and development.

  7. Llama 2: The New Open LLM SOTA (ft. Nathan Lambert, Matt Bornstein, Anton Troynikov, Russell Kaplan, Whole Mars Catalog et al.)

    Meta has released Llama 2, an open-source large language model that has quickly become the state-of-the-art in its weight class, outperforming other open models. The model was pre-trained on 2 trillion tokens with an expanded context length and significant investment in reinforcement learning from human feedback. Llama 2 is available for commercial use, addressing a critical need for organizations that cannot send sensitive data to external API providers and offering an alternative to proprietary models like GPT-4.

  8. Welcoming Llama Guard 4 on Hugging Face Hub

    Meta AI has released Llama 4, a new family of open-source large language models, available on Hugging Face. This release includes Llama Guard 4, a model specifically designed for safety, and two other models, Maverick and Scout. The availability of these models on Hugging Face Hub facilitates broader access and experimentation within the AI community.

  9. Open-Source Text Generation & LLM Ecosystem at Hugging Face

    A Hugging Face blog post surveys the open-source ecosystem for text generation and large language models. It aims to foster a more collaborative and accessible ecosystem, emphasizing community involvement and democratized access to powerful AI tools.

  10. Code Interpreter == GPT 4.5 (w/ Simon Willison, Alex Volkov, Aravind Srinivas, Alex Graveley, et al.)

    The Code Interpreter feature within ChatGPT is being discussed as a significant advancement, potentially equivalent to a GPT 4.5 model. This tool allows ChatGPT to write and execute Python code within a sandboxed environment, enabling it to process uploaded files and utilize a wide array of pre-installed libraries. Initially available to alpha testers, it has now been rolled out as a beta feature to all ChatGPT Plus subscribers.

  11. Accurately analyzing large-scale qualitative data

    Viable, a company founded in 2020, has developed a platform that leverages OpenAI's GPT-4 to analyze large volumes of qualitative customer data. Unlike simple summarization tools, Viable's system performs in-depth analysis, providing businesses with actionable insights to improve customer satisfaction and product development. This advanced analysis helps companies save significant operational hours and reduce customer churn by extracting nuanced sentiment from unstructured feedback.

  12. Cambrian explosion of generative models

    The "Practical AI" podcast discusses the recent surge in generative models, highlighting open-source advancements like Stable Diffusion XL and Zeroscope XL. Hosts Daniel and Chris predict that open models will eventually dominate the AI landscape, similar to open-source software. They also address the emerging challenges associated with this rapid progress, including cybersecurity risks, impacts on productivity, and broader cultural implications.

  13. What's going on with the Open LLM Leaderboard?

    The Hugging Face Open LLM Leaderboard has updated its evaluation methodology to include the MMLU benchmark, a comprehensive test of language model knowledge across 57 subjects. This change aims to provide a more robust assessment of model capabilities by incorporating a wider range of academic and professional domains. The leaderboard now uses a weighted average of MMLU scores alongside existing benchmarks to rank open-source large language models.

  14. Panel on Hugging Face

    A Hugging Face blog post covers running apps built with Panel, an open-source Python library from the HoloViz ecosystem, on Hugging Face. Panel integrates with various machine learning frameworks and lets developers build interactive dashboards and interfaces for their models. The aim is to lower the barrier to entry for deploying AI solutions, making them accessible to a wider range of users.

  15. Commoditizing the Petaflop — with George Hotz of the tiny corp

    George Hotz's company, tiny corp, has launched the tinybox, a $15,000 personal AI computer designed for local model training and inference. The tinybox boasts 738 FP16 TFLOPS and 144 GB of GPU RAM, capable of running a 65B LLaMA model out of the box. Hotz's approach with the tinygrad framework emphasizes a RISC philosophy for efficiency and avoids Turing-complete kernels, aiming to compete with established players like NVIDIA by focusing on developer experience and optimizing for off-the-shelf hardware.

  16. Fine-Tune MMS Adapter Models for low-resource ASR

    A Hugging Face guide shows how to fine-tune adapter models for MMS (Massively Multilingual Speech), Meta's multilingual ASR system. These adapters are designed to improve performance on low-resource languages, enabling better speech recognition for a wider range of linguistic communities. The focus is on making ASR technology more accessible and effective for languages with limited existing training data.

  17. Yes, Transformers are Effective for Time Series Forecasting (+ Autoformer)

    Researchers have demonstrated the effectiveness of Transformer models for time series forecasting tasks. The Autoformer architecture, specifically designed for this purpose, shows strong performance by decomposing the time series into seasonal and trend components. This approach allows for more accurate predictions by handling complex temporal dependencies.
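
    The seasonal/trend split mentioned above can be sketched with a simple moving average: the smoothed series is treated as the trend and the remainder as the seasonal part. This is a minimal pure-Python illustration of that idea, not Autoformer's actual implementation; the function names and default window size are hypothetical.

```python
def moving_average(series, window):
    """Centered moving average; the ends are padded by repeating edge values."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]

def decompose(series, window=5):
    """Split a series into a smooth trend and a residual seasonal part."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal
```

    Adding the two returned components back together recovers the original series, so the split loses no information; it only separates slow movement from faster fluctuation.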

  18. AI trends: a Latent Space crossover

    The Latent Space podcast hosted a crossover episode discussing the rapid evolution of AI over the past year, particularly focusing on the shift from pre-training to inference-time scaling laws. Speakers noted the surprising lack of widespread adoption for open-source models like Llama, despite their availability. The conversation also touched upon the rise of AI agents, the debate between vertical and horizontal AI startups, and the increasing importance of domain-specific model training and agent experience.

  19. Function calling and other API updates

    OpenAI has announced several updates to its API, including enhanced function calling capabilities for GPT-4 and GPT-3.5 Turbo models. These updates allow developers to more reliably connect AI models with external tools and APIs by enabling the models to output structured JSON for function arguments. Additionally, OpenAI is extending support for older model versions until June 2024 and has implemented cost reductions for its embeddings model and GPT-3.5 Turbo input tokens.
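
    To make the "structured JSON for function arguments" idea concrete, the sketch below shows the application side of such a flow: a function name and a JSON-encoded argument string are parsed and routed to a local Python function. No API call is made. The payload is a hand-written stand-in for a model response, and `get_weather`, `TOOLS`, and `dispatch` are hypothetical names.

```python
import json

def get_weather(city, unit="celsius"):
    """Hypothetical local tool; a real one would call a weather service."""
    return {"city": city, "unit": unit, "forecast": "sunny"}

# Registry mapping function names the model may request to local callables.
TOOLS = {"get_weather": get_weather}

def dispatch(function_call):
    """Run the function a model asked for, using its JSON-encoded arguments."""
    name = function_call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown function: {name}")
    # Arguments arrive as a JSON string, so they are parsed before the call.
    arguments = json.loads(function_call["arguments"])
    return TOOLS[name](**arguments)

# Example payload, standing in for what a model might return.
example_call = {"name": "get_weather", "arguments": '{"city": "Reykjavik"}'}
result = dispatch(example_call)
```

    Keeping an explicit registry means the model can only trigger functions the application has chosen to expose, and malformed argument strings fail at the `json.loads` step rather than inside the tool.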

  20. Can foundation models label data like humans?

    A Hugging Face blog post explores the use of large language models (LLMs) for data labeling, aiming to replicate human-level accuracy. This approach could significantly speed up and reduce the cost of data annotation for training AI models. The post discusses the potential and challenges of using LLMs in this capacity, particularly in comparison to traditional human annotators.

  21. From RLHF to RLHB: The Case for Learning from Human Behavior - with Jeffrey Wang and Joe Reeve of Amplitude

    Amplitude, a company known for its product analytics, is focusing heavily on integrating AI into its offerings. They are exploring methods beyond traditional Reinforcement Learning from Human Feedback (RLHF), which relies on explicit, often costly, and potentially biased user input. Instead, Amplitude advocates for learning from real user behavior within products, citing examples like GitHub Copilot and Midjourney, where implicit feedback is gathered naturally through user interaction. This approach aims to provide more authentic and cost-effective data for training AI models, potentially making AI analytics more crucial than AI itself.

  22. What will GPT-2030 look like?

    A new analysis projects that by 2030, large language models like a hypothetical "GPT2030" could surpass human capabilities in areas such as coding, math, and scientific design. This future model is expected to operate significantly faster than humans and be capable of massive parallelization, allowing for the execution of millions of human-equivalent years of work. Furthermore, GPT2030 might integrate diverse data modalities beyond text and images, leading to novel conceptual understanding and accelerating research while also posing substantial risks for misuse, particularly in cybersecurity and information manipulation.

  23. Replit AI Manifesto

    Replit is making its AI coding assistance features available to all 23 million developers on its platform, including those on the free tier. The company is also releasing a new 3-billion parameter Large Language Model, replit-code-v1.5-3b, trained on 1 trillion tokens with a focus on code and programming languages. Replit aims to integrate AI deeply into every aspect of its platform, eventually redefining the entire software development lifecycle for its users.

    IMPACT Accelerates AI integration into software development, making advanced coding tools accessible to a broader developer base.

  24. The Falcon has landed in the Hugging Face ecosystem

    The Falcon large language model has been integrated into the Hugging Face ecosystem. This integration makes the model more accessible to developers and researchers. Falcon is known for its strong performance on various benchmarks and its open-source nature.

  25. Improving mathematical reasoning with process supervision

    OpenAI has developed a new method called process supervision to improve AI's mathematical reasoning capabilities. This technique rewards each correct step in a problem-solving process, rather than just the final answer, leading to better performance and reduced hallucinations. The company found that process supervision not only enhances accuracy but also offers alignment benefits by directly training models to produce human-endorsed reasoning chains. OpenAI has released its dataset to encourage further research into this promising approach.
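
    The contrast between rewarding only the final answer and rewarding each step can be sketched in a few lines. The functions below are a hypothetical, simplified illustration rather than OpenAI's implementation; in particular, scoring a whole solution as the product of per-step correctness probabilities assumes the steps can be treated as independent.

```python
import math

def outcome_reward(final_answer, expected_answer):
    """Outcome supervision: one signal based only on the final answer."""
    return 1.0 if final_answer == expected_answer else 0.0

def process_rewards(step_labels):
    """Process supervision: one signal per reasoning step (True = judged correct)."""
    return [1.0 if ok else 0.0 for ok in step_labels]

def solution_score(step_probabilities):
    """Score a solution as the probability that every step is correct,
    treating steps as independent (an assumption of this sketch)."""
    return math.prod(step_probabilities)
```

    A solution that reaches the right answer through a flawed step still earns a full outcome reward, while its per-step rewards and overall solution score expose the faulty step.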

  26. 🐶Safetensors audited as really safe and becoming the default

    The safetensors library, developed by Hugging Face in collaboration with EleutherAI and Stability AI, has undergone a security audit by Trail of Bits, confirming its safety. This audit allows the organizations to move towards making safetensors the default format for saving and loading machine learning models, replacing the less secure pickle format used by PyTorch. The library offers benefits such as faster loading times and lazy loading capabilities, and will now be installed by default in the transformers library.

  27. MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML

    MosaicML has released MPT-7B, an open-source transformer model trained on one trillion tokens that matches LLaMA-7B's quality and is commercially licensed. This model boasts an impressive context length of up to 84,000 tokens, significantly exceeding limitations found in models like GPT-3. MosaicML also open-sourced its LLM Foundry codebase used for training and evaluation, alongside three fine-tuned versions of MPT-7B, including one specialized for long-form storytelling.

  28. Creating instruction tuned models

    Erin Mikail Staples discussed the creation of instruction-tuned Large Language Models at ODSC East. The conversation focused on the critical role of human feedback in this process. Staples also highlighted the significance of open data and practical tools for data annotation and fine-tuning custom generative AI models.

  29. Smaller is better: Q8-Chat, an efficient generative AI experience on Xeon

    Hugging Face has released Q8-Chat, a new generative AI model optimized for Intel Xeon CPUs. This model aims to provide an efficient AI experience directly on standard server hardware without requiring specialized GPUs. The development focuses on making powerful AI capabilities more accessible and cost-effective for a wider range of applications.

  30. Run a Chatgpt-like Chatbot on a Single GPU with ROCm

    Hugging Face has released a new guide detailing how to run a ChatGPT-like chatbot on a single AMD GPU using ROCm. This enables users with consumer-grade hardware to deploy powerful conversational AI models locally. The guide focuses on optimizing performance and accessibility for individuals and smaller organizations.

  31. RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious

    The RWKV (Receptance Weighted Key Value) project introduces a novel architecture that revives Recurrent Neural Networks (RNNs) while incorporating advantages typically found in Transformers. This approach aims to overcome the scaling limitations of traditional Transformers, particularly in training and inference, while maintaining competitive performance on reasoning benchmarks. The RWKV project is characterized by its distributed, international, and largely volunteer-driven community, drawing parallels to early EleutherAI efforts.

  32. Creating a Coding Assistant with StarCoder

    A Hugging Face blog post shows how to build a coding assistant on top of StarCoder, a large language model trained for code generation. The resulting assistant, StarChat, is a fine-tuned version of StarCoder, which was itself trained on a massive dataset of permissively licensed code from GitHub. The aim is to give developers a powerful and accessible tool for a range of coding tasks.

  33. Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit

    Replit has open-sourced its new code-focused large language model, replit-code-v1-3b. This model, which is significantly smaller than OpenAI's Codex, reportedly outperforms it on the HumanEval benchmark when fine-tuned on Replit's data. The release was discussed in an interview with Replit's Head of AI, Reza Shabani, who detailed the journey of training the model and its potential applications for developers.

  34. Large models on CPUs

    Mark Kurtz discusses the significant advancements in optimizing large AI models for CPU inference, highlighting that a substantial portion of model parameters often do not impact outputs. This optimization work, particularly through tools like Neural Magic's SparseML and SparseGPT, enables running complex generative AI models on standard hardware, reducing the reliance on expensive GPUs and making AI more accessible.

  35. Mapping the future of *truly* Open Models and Training Dolly for $30 — with Mike Conover of Databricks

    Databricks has released Dolly 2.0, an instruction-following large language model that is fully open source and commercially viable. Unlike LLaMA, Dolly 2.0's license permits business use, addressing a key limitation of previous open models. The model was fine-tuned on a human-generated instruction dataset and can be customized for specific data and styles, with Databricks offering a notebook to facilitate this process for approximately $30 in 30 minutes.

  36. Running IF with 🧨 diffusers on a Free Tier Google Colab

    Hugging Face has released a guide on how to run DeepFloyd IF, a new open-source text-to-image model, using the diffusers library on a free-tier Google Colab instance. This allows users to experiment with the model's capabilities without requiring powerful local hardware. The guide provides practical steps for setting up the environment and running inference, making advanced image generation accessible to a wider audience.

  37. A Recap of Replit Developer Day

    Replit has announced significant platform updates and new AI capabilities at its annual Developer Day. The company is expanding its offerings to teams with the launch of Replit Teams, designed to enhance collaboration and streamline development workflows. Additionally, Replit introduced Code Repair, an AI model that automates debugging and reportedly outperforms leading models like GPT-4 Turbo and Claude 3 Opus on specific benchmarks. The platform also unveiled improvements to its Workspace, including increased RAM and CPU limits, enhanced security for extensions, and production-grade deployments powered by Google Cloud Platform.

    IMPACT Accelerates team-based AI-assisted software development and introduces a new AI debugging tool.

  38. Graph Classification with Transformers

    Hugging Face has released a new blog post detailing how to perform graph classification tasks using Transformer models. The post provides a practical guide, likely aimed at researchers and developers, on leveraging the power of Transformers for analyzing graph-structured data. This approach could open new avenues for applying advanced deep learning techniques to domains where graph data is prevalent.

  39. Segment Anything Model and the Hard Problems of Computer Vision — with Joseph Nelson of Roboflow

    Meta AI has released its Segment Anything Model (SAM), a significant advancement in computer vision, which includes the model, weights, data, and a demo website. This open-source release is notable for its extensive dataset, containing significantly more images and masks than previous datasets. The podcast features Joseph Nelson of Roboflow discussing SAM's capabilities, including its zero-shot transfer and promptability, and demonstrating its integration into Roboflow's platform. The discussion also touches upon the broader landscape of multimodal AI and the remaining challenges in computer vision.

  40. StackLLaMA: A hands-on guide to train LLaMA with RLHF

    Hugging Face has published StackLLaMA, a hands-on guide to training a LLaMA model with Reinforcement Learning from Human Feedback (RLHF). The resulting model is fine-tuned on questions and answers from Stack Exchange. The release aims to provide a practical, accessible reference for the AI development community.

  41. Exploratory Analysis of TRLX RLHF Transformers with TransformerLens

    Researchers have demonstrated a method for training and analyzing language models using Reinforcement Learning from Human Feedback (RLHF). The process involves using the TRLX library for RLHF fine-tuning and TransformerLens for mechanistic interpretability. This approach was used to fine-tune a GPT-2 model to generate negatively biased movie reviews and then analyze the model to identify specific network regions responsible for this behavior.

  42. Fast Inference on Large Language Models: BLOOMZ on Habana Gaudi2 Accelerator

    Hugging Face has released a new guide detailing how to achieve fast inference for large language models like BLOOMZ using Habana Gaudi2 accelerators. The guide provides practical steps and optimizations for developers looking to leverage this hardware for efficient LLM deployment. This collaboration aims to make powerful AI models more accessible and performant on specialized hardware.

  43. Building Ghostwriter Chat

    Replit has launched Ghostwriter Chat, an AI pair programmer integrated directly into its IDE. This tool aims to provide developers with coding assistance, error debugging, and contextual answers without requiring them to leave their workspace. The feature was developed rapidly during a company hackweek and is designed to leverage LLMs by streaming responses for low latency and intelligently constructing prompts using repl context and chat history to overcome token limits.
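
    One simple way to fit context and chat history under a token limit is to always keep the context and the current question, then add the most recent history messages until the budget is used up. The sketch below illustrates that general idea only; it is not Replit's implementation, and `count_tokens`, `build_prompt`, and the whitespace-based token count are hypothetical stand-ins.

```python
def count_tokens(text):
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())

def build_prompt(context, history, question, budget):
    """Keep the context and question, then add as much recent history as fits."""
    used = count_tokens(context) + count_tokens(question)
    kept = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return "\n".join([context, *kept, question])
```

    Dropping the oldest messages first is the simplest policy; a production system would more likely use the model's real tokenizer and might summarize older turns instead of discarding them.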

    IMPACT Accelerates developer workflows by providing integrated AI coding assistance and debugging directly within the IDE.

  44. ARC Evals is now METR

    The Alignment Research Center's (ARC) evaluation team has officially spun off to form a new independent nonprofit organization named METR (Model Evaluation & Threat Research). METR will continue its work on evaluating frontier AI systems, focusing on their autonomous capabilities and potential threats, including AI self-improvement and evasion of oversight. The organization, led by Beth Barnes, has previously partnered with leading AI labs like OpenAI and Anthropic for evaluations and aims to develop rigorous testing methodologies to ensure AI safety before widespread deployment.

  45. Powering virtual education for the classroom

    Khan Academy is piloting a new AI-powered assistant called Khanmigo, which utilizes OpenAI's GPT-4 model. This tool is designed to serve as both a virtual tutor for students, offering individualized learning support and prompting deeper understanding, and as a classroom assistant for teachers, aiding in tasks like creating lesson materials and assessing student progress. The nonprofit is proceeding with responsible testing to explore the transformative potential of this technology in education.

  46. Preserving languages for the future

    Iceland has partnered with OpenAI to leverage GPT-4 for the preservation of the Icelandic language, which is at risk of decline due to digitalization. A team of 40 volunteers is using Reinforcement Learning from Human Feedback (RLHF) to train GPT-4 on proper Icelandic grammar and cultural nuances. This initiative aims not only to safeguard Icelandic but also to create a model for preserving other low-resource languages globally, preventing an "AI divide."

  47. Transforming visual accessibility

    Be My Eyes is integrating OpenAI's GPT-4 visual input capabilities into its app to create a "Virtual Volunteer" for the blind and visually impaired community. This new feature goes beyond basic image recognition by offering conversational analysis and contextual understanding, allowing users to identify objects, understand recipes from fridge contents, and navigate complex environments. The technology is also being used to improve web accessibility by summarizing cluttered pages and aiding in online shopping, with a beta test showing significant positive results.

  48. Filling crucial language learning gaps

    Duolingo has launched a new subscription tier, Duolingo Max, which integrates OpenAI's GPT-4 to enhance language learning. The new features include 'Role Play,' an AI conversation partner for practicing real-world dialogues, and 'Explain my Answer,' which provides contextual feedback on grammatical errors. These AI-powered tools aim to bridge the gap in conversational practice and detailed error correction, offering a more immersive and effective learning experience beyond traditional methods.

  49. GPT-4

    OpenAI has released GPT-4, a large multimodal model capable of processing both text and image inputs to generate text outputs. This new model demonstrates human-level performance on various professional and academic benchmarks, significantly outperforming its predecessor, GPT-3.5. OpenAI has improved GPT-4's factuality, steerability, and adherence to safety guardrails through extensive alignment work. The model's text capabilities are available via ChatGPT and the API, with image input functionality being prepared for wider release.

  50. Stripe

    Stripe has integrated OpenAI's GPT-4 model to enhance its platform, identifying 50 potential AI use cases and prototyping 15 features. The model has proven superior to human reviewers in classifying businesses and is being used to summarize company websites for better customer support. Additionally, GPT-4 is being deployed to assist developers by answering technical questions from documentation and to strengthen fraud detection across Stripe's community workflows.
