PulseAugur / Brief

last 24h
[50/2980] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

    Hugging Face has launched the LiveCodeBench leaderboard, designed to evaluate code-generating large language models (LLMs) more effectively. The benchmark aims to provide a contamination-free assessment by continuously collecting new problems from competitive programming sites and scoring models on problems published after their training cutoff, so that models are tested on their ability to generate correct and functional code rather than memorized solutions. The leaderboard will track performance across various coding tasks, offering a more reliable measure of a code LLM's true capabilities.

  2. Pile-T5

    EleutherAI has released Pile-T5, an updated version of the T5 language model. This new iteration was trained on the Pile dataset and utilizes the LLaMA tokenizer, addressing weaknesses in the original T5's handling of code and its pretraining data. Pile-T5 was trained for twice as many tokens as the original T5 and demonstrates significant performance improvements, particularly on code-related tasks, outperforming widely used T5 models even when matched for token count.

  3. Zero to GPT in 1 Year

    Smol AI has released a new open-source language model named "GPT-1" that was trained in just one year. This model is designed to be highly efficient and accessible, aiming to democratize access to powerful AI capabilities. The project highlights the rapid progress in AI development and the potential for smaller, more agile teams to contribute significantly to the field.

  4. Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention

    Meta has unveiled MTIA v2, the second generation of its Meta Training and Inference Accelerator, which is a custom chip rather than a family of models. The hardware is designed to run Meta's AI workloads more efficiently, aiming to improve performance and reduce computational costs. The release showcases Meta's continued investment in custom AI infrastructure.

  5. Music's Dall-E moment

    A new AI model called "MusicLM" has been developed by Google Research that can generate music from text descriptions. This model is capable of producing high-fidelity music in various genres and styles, responding to prompts like "calming jazz for studying" or "80s electronic dance music." MusicLM works by converting text prompts into musical pieces, demonstrating a significant advancement in AI-driven music creation. The research paper detailing MusicLM highlights its potential to revolutionize how music is composed and experienced.

  6. Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B

    Hugging Face has published a walkthrough that pairs its Dataset Viewer API with DuckDB-NSQL-7B, a text-to-SQL model from MotherDuck and Numbers Station. The combination lets users query datasets in natural language, with the model translating prompts into DuckDB SQL. The integration aims to simplify data analysis by allowing direct interaction with data through conversational prompts.

  7. Navigating the challenges and opportunities of synthetic voices

    OpenAI has previewed its Voice Engine model, capable of generating natural-sounding speech from a 15-second audio sample. The technology, developed in late 2022, has been used internally for features like ChatGPT Voice and is being tested with partners for applications in education, content translation, and assistive communication. OpenAI is proceeding cautiously with a broader release due to potential misuse, aiming to foster dialogue on responsible deployment.

  8. Jamba: Mixture of Architectures dethrones Mixtral

    AI21 Labs has introduced Jamba, a hybrid architecture that interleaves Mamba state space model (SSM) layers with transformer layers and mixture-of-experts modules. This hybrid approach aims to achieve the long-sequence efficiency of SSMs while retaining the performance capabilities of transformers. Early evaluations suggest Jamba outperforms existing models like Mixtral on various benchmarks, indicating a potential new direction for efficient large language model design.

  9. Mamba Explained

    Mamba, a new State Space Model (SSM), presents an alternative to the dominant Transformer architecture in AI. It aims to match Transformer performance and scaling laws while efficiently handling extremely long sequences, potentially up to one million tokens. This is achieved by removing the quadratic bottleneck found in Transformer attention mechanisms, allowing for faster inference and linear scaling with sequence length. Mamba has demonstrated state-of-the-art results across various modalities, including language, audio, and genomics, outperforming Transformers of similar or even larger sizes.

  10. DBRX: Best open model (just not most efficient)

    Databricks has released DBRX, an open-source large language model that rivals proprietary models in performance. While it excels in many benchmarks, it is not the most efficient in terms of computational resources. DBRX was trained on a substantial dataset of 12 trillion tokens and uses a fine-grained mixture-of-experts architecture.

  11. Yi-34B, Llama 2, and common practices in LLM training: a fact check of the New York Times

    EleutherAI has fact-checked a New York Times article claiming that the Chinese AI model Yi-34B is heavily reliant on Meta's Llama 2. EleutherAI states that the similarities are due to common architectural building blocks used across all modern large language models, not direct technological dependence. The article highlighted a Hugging Face issue where Yi-34B's component naming caused compatibility problems with Llama 2-focused code, which EleutherAI explains as a standard open-source interoperability challenge, not evidence of concealed reliance.

  12. Pollen-Vision: Unified interface for Zero-Shot vision models in robotics

    Hugging Face has introduced Pollen-Vision, a new unified interface designed to streamline the use of zero-shot vision models in robotics. This development aims to simplify how robots can understand and interact with their environment by leveraging advanced AI capabilities. The interface is expected to accelerate research and development in embodied AI by making these powerful models more accessible and easier to integrate into robotic systems.

  13. Sora first impressions

    OpenAI has shared early impressions of its Sora text-to-video model from visual artists, designers, and filmmakers. These creatives are exploring how Sora can aid their work, enabling them to bring impossible or surreal ideas to life and overcome limitations of time and budget. Artists highlighted Sora's potential for abstract expressionism and visualizing concepts, noting it opens new avenues for storytelling and rapid iteration in their creative processes.

  14. GaLore: Advancing Large Model Training on Consumer-grade Hardware

    Hugging Face has highlighted GaLore (Gradient Low-Rank Projection), a technique designed to enable the training of large language models on consumer-grade hardware. The method projects gradients into a low-rank subspace, which shrinks the memory needed for optimizer states while still updating all model parameters. GaLore aims to democratize access to large model training, making it more feasible for researchers and developers with limited resources.

  15. Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

    Researchers have introduced Cosmopedia, a novel method for generating large-scale synthetic data specifically designed for pre-training Large Language Models (LLMs). This approach aims to address the growing need for high-quality, diverse datasets that are crucial for advancing LLM capabilities. The development of Cosmopedia could significantly impact the efficiency and effectiveness of future LLM training.

  16. A Chatbot on your Laptop: Phi-2 on Intel Meteor Lake

    Intel and Hugging Face have partnered to enable Microsoft's Phi-2 language model to run efficiently on Intel's Meteor Lake processors. This collaboration allows for on-device AI capabilities, bringing chatbots and other AI applications directly to laptops without relying on cloud servers. The integration leverages Intel's OpenVINO toolkit to optimize the model's performance for local execution.

  17. Grok-1 in Bio

    The Grok-1 large language model has been made available for biological research applications. This release aims to accelerate discoveries in the life sciences by providing researchers with a powerful AI tool. The model's capabilities are expected to aid in areas such as drug discovery and genomic analysis.

  18. Measuring the impact of post-training enhancements

    Researchers at METR have conducted experiments to measure the impact of post-training enhancements on AI agent capabilities, using versions of OpenAI's GPT-3.5 Turbo and GPT-4. Their findings indicate that OpenAI's own post-training efforts significantly boosted agent performance by 26 percentage points, a gain comparable to the jump from GPT-3.5 to GPT-4. While their own attempts to further improve agent performance through tweaked prompting and tools yielded smaller, statistically insignificant gains, the study suggests that dramatically increasing a model's dangerous capabilities after it has been competently fine-tuned may be challenging, though further research is needed.

  19. Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

    Hugging Face has released the WebSight dataset, which contains over 100,000 pairs of web screenshots and their corresponding HTML code. This dataset aims to facilitate the development of AI models capable of converting visual web page designs into functional HTML. The initiative seeks to improve the efficiency and accessibility of web development by enabling AI to understand and generate web structures from visual inputs.

  20. Making Transformers Sing - with Mikey Shulman of Suno

    Suno, a company founded by former Kensho employees who are also musicians, has developed advanced AI models for audio generation, moving beyond traditional text-to-speech. Their initial open-source model, Bark, demonstrated capabilities in generating speech, music, and sound effects by training on broad audio data rather than limited text-to-speech datasets. Suno's subsequent product, which gained significant attention in December 2023, aims to democratize music creation, allowing anyone to become a music maker.

  21. Generating the future of art & entertainment

    Runway, an applied AI research company, is significantly impacting the future of art and entertainment with its advanced text-to-video models. Co-founder and CTO Anastasis Germanidis discussed the company's growth and its role in defining the creative landscape. Runway's work focuses on leveraging AI to enhance human creativity in media production.

  22. Top 5 Research Trends + OpenAI Sora, Google Gemini, Groq Math (Jan-Feb 2024 Audio Recap) + Latent Space Anniversary with Lindy.ai, RWKV, Pixee, Julius.ai, Listener Q&A!

    The Latent Space podcast released an audio recap covering AI research trends and industry news from January and February 2024. Key topics discussed include advancements in long inference, synthetic data generation, alternative model architectures like Mamba and RWKV, Mixture of Experts models, and online LLMs. The recap also featured discussions on OpenAI's Sora video generation model, Google's Gemini Pro 1.5 with its 1 million token context window, and the performance of Groq's inference engine.

  23. Inflection-2.5 at 94% of GPT4, and Pi at 6m MAU

    Inflection AI has released Inflection-2.5, a new model that achieves 94% of the performance of OpenAI's GPT-4. The company also reported that its personal AI chatbot, Pi, has reached 6 million monthly active users. This update signifies a notable step forward for Inflection AI in its competition with other leading AI developers.

  24. YOLOv9: Computer vision is alive and well

    Researchers have released YOLOv9, a new computer vision model that introduces advancements in parameter-efficient deep learning architectures. This development highlights ongoing progress in computer vision alongside the hype surrounding generative AI. The release also coincides with discussions on parameter-efficient large language models, including Microsoft's 1-Bit LLMs and Qualcomm's new AI Hub.

  25. Using AI to improve patient access to clinical trials

    Paradigm, a healthcare technology company, has partnered with OpenAI to improve patient access to clinical trials. By integrating GPT-4 into their platform, Paradigm has significantly enhanced their ability to match patients with suitable trials, overcoming previous limitations of traditional ML models. This integration has led to a substantial increase in accuracy, a reduction in the time and resources needed for data evaluation, and has accelerated Paradigm's ability to expand its services.

  26. Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?

    Hugging Face has introduced ConTextual, a new benchmark designed to evaluate how well multimodal AI models can jointly reason over text and images in text-rich scenes. This benchmark aims to push the capabilities of models beyond simple object recognition, focusing on their ability to interpret complex visual information that includes significant textual elements. ConTextual will help researchers and developers assess and improve the performance of multimodal systems in real-world scenarios where text and images are intertwined.

  27. The Era of 1-bit LLMs

    Researchers are exploring the potential of 1-bit Large Language Models (LLMs), which represent a significant departure from traditional models that use multiple bits per parameter. This approach aims to drastically reduce the computational resources and memory required for training and running LLMs. While still in early stages, 1-bit LLMs could pave the way for more efficient and accessible AI.

  28. Dia de las Secuelas (StarCoder, The Stack, Dune, SemiAnalysis)

    The StarCoder2 family of models has been released, featuring three distinct sizes: 3B, 7B, and 15B parameters. These models were trained on The Stack v2 dataset, which comprises over 600 programming languages. Developed collaboratively by Hugging Face, ServiceNow, and NVIDIA, StarCoder2 aims to advance code generation capabilities.

  29. StarCoder2 and The Stack v2

    Hugging Face has released StarCoder2, a new family of large language models for code generation, trained on a massive dataset called The Stack v2. This dataset comprises over 600 programming languages and includes a significant amount of permissively licensed code. The StarCoder2 models are available in three sizes, with the largest boasting 15 billion parameters, and are designed to advance research and development in AI-powered coding tools.

  30. Mistral Large disappoints

    Mistral Large, a new flagship model from French AI company Mistral AI, has reportedly failed to meet expectations in early evaluations. While details are scarce, the model's performance appears to be underwhelming compared to its predecessors and competitors. This comes as Mistral AI continues to position itself as a major player in the European AI landscape.

  31. Ring Attention for >1M Context

    Researchers have developed a novel method called Ring Attention, which significantly expands the context window of large language models to over one million tokens. This technique allows models to process and retain information from much larger inputs than previously possible. The advancement could lead to more capable AI systems that can handle complex documents and extended conversations.

  32. Introducing the Red-Teaming Resistance Leaderboard

    Hugging Face has launched a new leaderboard to track the performance of AI models in resisting adversarial attacks. This initiative aims to foster research into AI safety by providing a public platform for evaluating and comparing models' robustness against red-teaming efforts. The leaderboard will highlight models that demonstrate stronger defenses against prompt injection and other manipulation techniques, encouraging the development of more secure AI systems.

  33. 🪆 Introduction to Matryoshka Embedding Models

    Hugging Face has introduced Matryoshka embedding models, a novel approach to creating embeddings that can dynamically adjust their dimensionality. These models allow for a trade-off between performance and computational cost, enabling users to select an embedding size that best suits their specific needs. This flexibility makes them suitable for a wide range of applications, from resource-constrained environments to those requiring high-fidelity representations.

  34. Google AI: Win some (Gemma, 1.5 Pro), Lose some (Image gen)

    Google AI has released Gemma, a family of open models, alongside an update to its Gemini 1.5 Pro model. The Gemma models are available in 2B and 7B parameter sizes and are designed for responsible AI development. However, Google's image generation capabilities have faced criticism and scrutiny.

  35. Welcome Gemma 2 - Google’s new open LLM

    Google has released Gemma 2, an updated version of its open large language model. This new iteration offers improved performance and capabilities compared to its predecessor. The model is available for researchers and developers to explore and build upon.

  36. Sora pushes SOTA

    OpenAI's Sora text-to-video model has reportedly achieved state-of-the-art (SOTA) performance, according to a recent analysis. While details remain scarce, this suggests Sora may be setting new benchmarks in its capabilities. The specific metrics and comparisons that led to this conclusion are not yet publicly available.

  37. AI gets Memory

    A new AI model has been developed that can remember past conversations and interactions. This advancement allows the AI to maintain context over extended periods, leading to more coherent and personalized user experiences. The ability to retain memory is a significant step towards more sophisticated and human-like AI assistants.

  38. The Dissection of Smaug (72B)

    Abacus AI has released Smaug-72B, a new large language model. This model is notable for its performance on various benchmarks, including achieving state-of-the-art results on the MT-Bench leaderboard. Smaug-72B was trained on a dataset of 1.5 trillion tokens and is available for research purposes.

  39. How to Generate and Use Synthetic Data for Finetuning

    Synthetic data, generated by models or simulations rather than real-world sources, offers a faster and more cost-effective alternative to human annotation for fine-tuning AI models. This approach can lead to improved model performance and generalization while also mitigating privacy and copyright concerns. Two primary methods for generating synthetic data include distillation from a more capable model and self-improvement techniques where a model refines its own output. These methods can be applied to pretraining, instruction-tuning, and preference-tuning to enhance various aspects of a model's capabilities.

  40. Qwen 1.5 Released

    Alibaba's Qwen team has released Qwen 1.5, an updated suite of large language models. The models range in size from 0.5 billion to 72 billion parameters and are available in both base and chat-optimized versions. Qwen 1.5 models have demonstrated strong performance on various benchmarks, including MMLU and GSM8K, and are released under an open-source license.

  41. Data synthesis for SOTA LLMs

    Nous Research, a collective of LLM researchers, has developed popular open-access models like the Hermes family by employing state-of-the-art data synthesis techniques. In a recent discussion, Karan from Nous elaborated on the group's origins and their effective fine-tuning strategies, highlighting the success of data synthesis in their work. The conversation also touched upon the potential of blockchain technology to address authenticity issues in the digital realm, including AI-generated content and creator compensation.

  42. AI2 releases OLMo - the 4th open-everything LLM

    AI2 has released OLMo, an open-source large language model. This model is notable for its commitment to full transparency, including the release of its training data, code, and weights. OLMo aims to foster reproducible research and accelerate progress in the field by providing a truly open platform for AI development.

  43. SegMoE: Segmind Mixture of Diffusion Experts

    Segmind has introduced SegMoE, a novel Mixture-of-Diffusion-Experts model designed for enhanced image generation. This architecture leverages multiple specialized diffusion models, allowing for more efficient and higher-quality image synthesis. The approach aims to improve performance by dynamically selecting and combining the outputs of these expert models.

  44. Patch Time Series Transformer in Hugging Face

    Hugging Face has released PatchTST, a novel time series transformer model that significantly outperforms previous state-of-the-art models on various benchmarks. PatchTST addresses the limitations of existing transformer architectures in handling long sequences by employing a patching mechanism. This approach allows for more efficient processing and improved performance in time series forecasting tasks.

  45. Constitutional AI with Open LLMs

    Hugging Face has released a guide detailing how to implement Constitutional AI (CAI) with open large language models (LLMs). This approach allows developers to steer AI behavior using a set of predefined principles, or a "constitution," without requiring extensive human feedback for fine-tuning. The guide provides practical steps and code examples for integrating CAI into open LLM development workflows.

  46. Miqu confirmed to be an early Mistral-medium checkpoint

    The model known as Miqu has been identified as an early iteration of Mistral AI's "Mistral-medium" model. This revelation sheds light on the development lineage of Mistral's more advanced AI systems. Further details regarding its specific architecture or performance characteristics were not provided in the source.

  47. Building an early warning system for LLM-aided biological threat creation

    OpenAI has developed a new evaluation method to assess the risk of large language models aiding in the creation of biological threats. Their initial study, involving biology experts and students, found that GPT-4 provided only a mild, statistically insignificant uplift in accuracy for threat creation tasks compared to internet-only access. This research is part of OpenAI's broader Preparedness Framework and aims to contribute to community understanding and the development of safety evaluations for AI-enabled risks.

  48. CodeLLama 70B beats GPT4 on HumanEval

    CodeLLama 70B has surpassed GPT-4 in performance on the HumanEval benchmark, a key measure for evaluating code generation capabilities. This advancement indicates a significant step forward in open-source large language models for programming tasks. The model's achievement highlights the rapid progress being made in the field, particularly in specialized AI domains.

  49. RWKV "Eagle" v5: Your move, Mamba

    The RWKV Foundation has released Eagle v5, a new iteration of its open-source large language model. This version aims to compete with other advanced models like Mamba, which has gained attention for its efficiency. Eagle v5 is presented as a significant development in the open-source AI community, offering an alternative to proprietary systems.

  50. Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding

    Hugging Face has released optimizations for the StarCoder language model, enabling it to run more efficiently on Intel Xeon processors. These optimizations include quantization techniques like Q8 and Q4, which reduce the model's size and computational requirements. Additionally, speculative decoding is implemented to further enhance inference speed, making StarCoder more accessible for deployment on a wider range of hardware.
