PulseAugur / Brief

last 24h
[50/8376] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Efficient Table Pre-training without Real Data: An Introduction to TAPEX

    Researchers have introduced TAPEX, a pre-training method for improving table understanding in language models. The approach trains a model to act as a neural SQL executor over tables, using automatically synthesized SQL queries and their execution results instead of real data. TAPEX demonstrates improved performance on various table-related downstream tasks, offering a more efficient way to train models on structured information without requiring extensive real-world datasets.
  2. DALL·E 2 research preview update

    OpenAI is expanding access to its DALL·E 2 research preview, inviting up to 1,000 new users weekly from its waitlist. The company has focused on enhancing safety systems, with less than 0.05% of shared images flagged for policy violations. OpenAI is also working to address biases the model inherited from its training data, and is asking early users to avoid sharing photorealistic images with faces.
  3. We Raised $100 Million for Open & Collaborative Machine Learning 🚀

    Hugging Face has secured $100 million in a Series C funding round, led by Lux Capital. This investment aims to accelerate the development of open-source machine learning technologies and foster a more collaborative AI ecosystem. The company plans to expand its platform and community initiatives to support researchers and developers globally.
  4. Accelerate Large Model Training using DeepSpeed

    Hugging Face has released new guides detailing how to accelerate the training of large AI models. The guides focus on two key technologies: DeepSpeed and PyTorch's Fully Sharded Data Parallel (FSDP). By implementing these techniques, developers can more efficiently train complex models, potentially reducing computational costs and time.
  5. Machine Learning Experts - Lewis Tunstall

    Lewis Tunstall, a prominent figure in the machine learning community and an engineer at Hugging Face, shared insights during an interview. He discussed the evolving landscape of open-source AI models and the importance of community-driven development. Tunstall also touched upon the practical applications and future directions of large language models.
  6. Habana Labs and Hugging Face Partner to Accelerate Transformer Model Training

    Habana Labs and Hugging Face have announced a partnership aimed at speeding up the training of transformer models. This collaboration will integrate Habana's Gaudi deep learning accelerators with Hugging Face's popular libraries and platforms. The goal is to provide developers with more efficient tools for training large AI models.
  7. It's been a BIG week in AI news 🗞

    BigScience is currently training a large language model, attracting significant global attention. Concurrently, NVIDIA has unveiled its newest generation of GPUs, the "Hopper" series. These developments, alongside other AI-related news, were discussed in a recent episode of Practical AI.
  8. Fine-Tune a Semantic Segmentation Model with a Custom Dataset

    Hugging Face has published a guide detailing how to fine-tune a semantic segmentation model using a custom dataset. The tutorial focuses on the SegFormer model, demonstrating the process of adapting it for specific segmentation tasks. This guide is intended to help users leverage pre-trained models and tailor them to their unique data requirements.
  9. New GPT-3 capabilities: Edit & insert

    OpenAI has introduced new GPT-3 and Codex capabilities that allow for editing and inserting content within existing text, moving beyond simple text completion. The 'insert' feature enables contextually relevant additions in the middle of text or code, improving applications like long-form writing and code generation. Additionally, a new 'edits' endpoint allows for modifications to existing text based on specific instructions, useful for tasks such as refactoring code, changing tone, or fixing errors. These features are now available in beta via the OpenAI API and are being piloted in tools like GitHub Copilot.
  10. Generating Human-level Text with Contrastive Search in Transformers 🤗

    Hugging Face has introduced two new text generation techniques for its Transformers library: contrastive search and constrained beam search. Contrastive search aims to produce more human-like text by balancing likelihood and distinctiveness, while constrained beam search allows users to guide the generation process with specific rules or patterns. These methods offer developers more control and improved quality for text generation tasks within the Hugging Face ecosystem.
  11. One algorithm to rule them all?

    Researchers have developed an AI system capable of quickly predicting protein attachments, a significant advancement in biological research. Additionally, a new self-supervised algorithm from Meta AI demonstrates high performance across speech, vision, and text modalities. DeepMind has also announced an AI coding engine that matches the proficiency of an average human programmer.
  12. Fine-Tune ViT for Image Classification with 🤗 Transformers

    Hugging Face has released a guide on fine-tuning the Vision Transformer (ViT) model for image classification tasks. The tutorial utilizes the 🤗 Transformers library, demonstrating how to adapt a pre-trained ViT model to a specific dataset. This process allows developers to leverage powerful pre-trained models for custom image recognition applications without training from scratch.
  13. Announcing GPT-NeoX-20B

    EleutherAI has released GPT-NeoX-20B, a 20 billion parameter open-source language model trained using their GPT-NeoX framework. This model is notable for being the largest publicly accessible pretrained autoregressive language model to date. The release aims to facilitate research into the safe use of AI systems, with the model available via inference services and a public release scheduled after a seven-day delay.
  14. Solving (some) formal math olympiad problems

    OpenAI has developed a neural theorem prover for the Lean formal proof assistant that can solve challenging high-school olympiad math problems. The system utilizes a language model to discover proofs, iteratively improving its performance by using newly found proofs as training data. This approach achieved a new state-of-the-art on the miniF2F benchmark, outperforming previous methods.
  15. Getting Started With Embeddings

    OpenAI has released new embedding models, text-embedding-3-small and text-embedding-3-large, offering significant improvements in performance and efficiency over previous models like text-embedding-ada-002. These new models are designed to better understand relationships between concepts in text and code, powering applications such as semantic search and retrieval-augmented generation. OpenAI is also reducing prices for GPT-3.5 Turbo and updating its GPT-4 Turbo preview model, while also enhancing API key management and usage transparency for developers.
  16. Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs

    Hugging Face has released Infinity, a new inference engine designed to optimize large language model performance on modern CPUs. This engine achieves millisecond latency by leveraging techniques like quantization and efficient memory management. The goal is to make powerful LLMs more accessible and cost-effective for a wider range of applications without requiring specialized hardware.
  17. Making automatic speech recognition work on large files with Wav2Vec2 in 🤗 Transformers

    Hugging Face has released updates to its Transformers library, enhancing the Wav2Vec2 model for automatic speech recognition (ASR). The library now supports processing large audio files by implementing chunking, which breaks down large files into smaller, manageable segments. Additionally, performance is boosted through the integration of n-grams, further improving the accuracy and efficiency of speech recognition tasks.
  18. Perceiver IO: a scalable, fully-attentional model that works on any modality

    Perceiver IO is a new AI model architecture developed by DeepMind that utilizes a fully attentional mechanism to process information from various modalities. Unlike previous models that required modality-specific input processing, Perceiver IO can handle diverse data types like images, audio, and text directly. This approach aims to create a more scalable and unified framework for multimodal AI research and applications.
  19. Using custom GPTs

    OpenAI has introduced GPTs, a new feature allowing users to create custom versions of ChatGPT tailored for specific tasks or workflows. These custom GPTs can be built without coding by providing instructions, additional knowledge, and defining capabilities like web search or image generation. The company plans to launch a GPT Store later this month, where creators can share their GPTs and potentially monetize them, while also implementing safety measures to review submissions.
  20. Training CodeParrot 🦜 from Scratch

    Hugging Face has released CodeParrot, a language model trained specifically for code generation. The model uses the GPT-2 architecture and was trained from scratch on Python source code. CodeParrot is designed to assist developers by generating code snippets, completing code, and potentially aiding in debugging tasks.
  21. Introducing Snowball Fight ☃️, our first ML-Agents environment

    Hugging Face has released Snowball Fight, a new machine learning environment designed for training agents. This environment is built using the ML-Agents toolkit and aims to provide a platform for developing and testing AI agents in a simulated setting. The release is intended to foster innovation in reinforcement learning and agent-based AI development within the community.
  22. 2023 Year In Review

    METR, an AI safety research organization, detailed its 2023 accomplishments, including developing methodologies for evaluating AI agents on autonomous tasks and contributing to OpenAI's GPT-4 system card. The organization also proposed "Responsible Scaling Policies" (RSPs), a framework for AI safety that gained traction among researchers and companies like Anthropic and OpenAI. Additionally, METR partnered with the UK AI Safety Institute and evaluated GPT-5.1 for catastrophic risks.
  23. Zero-shot multitask learning

    The BigScience research workshop, a year-long initiative by Hugging Face, has released the T0 family of AI models. These models are specifically designed to explore zero-shot multitask learning in natural language processing. The T0 models demonstrate the potential for AI to generalize across various tasks without explicit training for each one.
  24. Accelerating PyTorch distributed fine-tuning with Intel technologies

    Hugging Face has partnered with Intel to optimize PyTorch distributed fine-tuning using Intel's latest technologies. This collaboration focuses on enhancing performance and efficiency for large language model training. The integration aims to leverage Intel's hardware advancements to accelerate the fine-tuning process, making it more accessible and faster for researchers and developers.
  25. Solving math word problems

    OpenAI has developed a new system capable of solving grade school math word problems with nearly double the accuracy of previous GPT-3 models. This system achieves approximately 90% of the performance of real children in the 9-12 age range by training the model to recognize and correct its own errors through repeated attempts. The approach involves using verifiers to evaluate multiple candidate solutions, selecting the best one, which offers a significant performance boost and appears to scale more effectively with data than simply increasing model size.
  26. The Age of Machine Learning As Code Has Arrived

    Hugging Face has announced a new initiative, "Machine Learning as Code," aiming to standardize how machine learning models are developed, shared, and deployed. This approach treats ML models like software code, emphasizing version control, reproducibility, and collaboration. The goal is to streamline the ML lifecycle, making it more accessible and efficient for developers and researchers.
  27. Fine tuning CLIP with Remote Sensing (Satellite) images and captions

    Hugging Face has released a guide on fine-tuning the CLIP model using remote sensing images and their corresponding captions. This process involves adapting the pre-trained CLIP model to better understand and associate visual information from satellite imagery with textual descriptions. The guide details the steps and considerations for this specialized application of CLIP, enabling more accurate analysis and retrieval of geospatial data.
  28. Summer at Hugging Face

    Hugging Face is hosting a series of events and releasing new features throughout the summer. These initiatives aim to foster community engagement and advance the open-source AI ecosystem. Key highlights include new model releases, educational content, and opportunities for developers to collaborate and showcase their work.
  29. Convert Transformers to ONNX with Hugging Face Optimum

    Hugging Face has released Optimum, a new toolkit designed to optimize Transformer models for various hardware accelerators. This initiative includes partnerships with hardware vendors like Graphcore, enabling users to run models more efficiently on specialized hardware such as IPUs. The toolkit supports conversion to ONNX format, further enhancing model performance and deployment flexibility across different platforms.
  30. Deep Learning over the Internet: Training Language Models Collaboratively

    Hugging Face has described an approach for training large language models collaboratively over the internet. Instead of relying on a single large cluster, the approach pools compute contributed by many volunteers, whose hardware and network connections vary and who may join or leave during training. The aim is to let a group train a model that no single participant could afford to train alone.
  31. EleutherAI Second Retrospective: The long version

    EleutherAI has released a retrospective detailing their work over the past year and a half. Key achievements include the development of the open-source LLM GPT-NeoX-20B and contributions to text-to-image generation models like VQGAN-CLIP. The organization has also seen several members depart to found new AI research entities focused on alignment, preference learning, and biomedical applications.
  32. SetFit: Efficient Few-Shot Learning Without Prompts

    Hugging Face has introduced SetFit, a few-shot learning approach that achieves strong performance without requiring prompt engineering. The method uses a two-stage process: first, it fine-tunes a Sentence Transformer on a small set of labeled examples with a contrastive objective over sentence pairs, and then it trains a classification head on the embeddings produced by the fine-tuned model. SetFit has outperformed prompt-based methods like few-shot GPT-3 on several benchmarks and is available as an open-source library.
  33. Why Release a Large Language Model?

    EleutherAI has detailed its reasoning for releasing large language models, emphasizing that open access is crucial for advancing AI safety research. The organization argues that significant safety studies, particularly in model interpretability, can only be effectively conducted with access to these powerful models. They believe that the potential dangers of current large language models are not world-ending and that releasing them allows for critical safety research to be performed before models become significantly more powerful and potentially uncontrollable. Furthermore, EleutherAI contends that attempts to restrict access to this technology are futile, as well-funded actors can replicate it, making open release the best strategy to empower society to study and utilize it for beneficial purposes.
  34. On the Sizes of OpenAI API Models

    EleutherAI has estimated the parameter counts of OpenAI's API models by comparing their performance on various tasks to known benchmarks. Their analysis suggests that models like Ada, Babbage, Curie, and Davinci correspond to approximately 350 million, 1.3 billion, 6.7 billion, and 175 billion parameters, respectively. While not official figures, these estimates provide a strong indication of the scale of OpenAI's deployed models.
  35. Evaluating Different Fewshot Description Prompts on GPT-3

    Researchers at EleutherAI investigated how different few-shot description prompts affect GPT-3's performance on the SST benchmark. Their experiments revealed that smaller GPT-2 models performed poorly and inconsistently, with performance not strictly increasing with model size. Surprisingly, the study found no correlation between different GPT models regarding which prompts yielded the best results, challenging the expectation that similar models would favor similar prompting strategies.
  36. Finetuning Models on Downstream Tasks

    Researchers at EleutherAI explored the impact of fine-tuning the GPT-Neo 2.7B model on a diverse set of downstream tasks. They observed that while the fine-tuned model did not universally outperform the base model, it showed significant improvements on certain tasks like ANLI. However, this specialization came at the cost of performance degradation on tasks not included in the fine-tuning set, such as LAMBADA and PubMedQA, indicating a potential for catastrophic forgetting.
  37. Generating "hunches" using smart home data 🏠

    Amazon has developed a new feature for Alexa called "hunches" that uses complex smart home data to anticipate user needs. This system synthesizes disparate data from various devices and configurations, even amidst anomalies like those seen during a pandemic. The goal is to create a more intuitive and proactive smart home experience for users.
  38. Next-gen voice assistants

    PolyAI CEO Nikola Mrkšić discussed advancements in conversational AI and the development of next-generation voice assistants capable of human-level conversations. The company's ConveRT model has demonstrated superior performance compared to BERT and GPT-based models in evaluations, particularly in understanding various languages and accents. PolyAI's technology aims to enhance customer service interactions through more sophisticated voice assistant capabilities.
  39. Understanding BigBird's Block Sparse Attention

    BigBird is a novel attention mechanism designed to address the quadratic complexity of standard Transformer models. It achieves this by employing a sparse attention pattern, which includes global, window, and random attention, allowing it to process significantly longer sequences than traditional Transformers. This innovation makes BigBird particularly effective for tasks requiring long-range dependencies, such as document summarization and question answering on extensive texts.
  40. GPT-3 powers the next generation of apps

    OpenAI has announced that over 300 applications are now leveraging its GPT-3 API to provide advanced AI features. These applications span various sectors, including productivity, education, and gaming, demonstrating GPT-3's versatility in tasks like search, conversation, and text completion. Companies such as Viable, Fable Studio, and Algolia are highlighted for their innovative uses of GPT-3, with Algolia reporting significant improvements in search accuracy compared to previous models.
  41. Reducing Toxicity in Language Models

    OpenAI has shared insights gained from deploying its language models, highlighting that real-world misuse often differs from initial fears. The company emphasized the limitations of current evaluation methods and the need for novel benchmarks to address safety concerns. OpenAI also noted that basic safety research significantly enhances the commercial utility of AI systems.
  42. Fine-Tune W2V2-Bert for low-resource ASR with 🤗 Transformers

    Hugging Face has released a series of blog posts detailing how to fine-tune various Wav2Vec2 and Whisper models for Automatic Speech Recognition (ASR) tasks using their Transformers library. These guides cover adapting models for low-resource scenarios, multilingual applications, and specific languages like English. The tutorials emphasize practical implementation for researchers and developers working with speech data.
  43. Replit Case Study - Catalyst Coding Club

    Replit has launched Agent v2, an enhanced AI coding assistant that offers greater autonomy and a real-time application design preview. This new version is designed to be less prone to errors and more efficient in generating user interfaces. The update is available to paid Replit users through an early access program, with further features planned for release in the coming weeks. Replit also introduced Replit Projects, a beta feature for teams to collaborate on codebases with version control and merging capabilities, aiming to streamline the development process.

    IMPACT Enhances developer productivity and collaboration through AI-powered coding assistance and project management tools.

  44. Hugging Face Reads, Feb. 2021 - Long-range Transformers

    This blog post from Hugging Face discusses the advancements in long-range Transformers, a type of neural network architecture. It explores how these models are being developed to handle longer sequences of text, overcoming previous limitations. The post likely delves into the technical aspects and potential applications of these more capable Transformer models.
  45. Multimodal neurons in artificial neural networks

    OpenAI researchers have identified "multimodal neurons" within their CLIP model, which respond to concepts regardless of whether they are presented visually, symbolically, or textually. This discovery offers insight into how CLIP achieves high accuracy on challenging datasets by abstracting concepts, similar to how neurons in the human brain function. The findings suggest a common mechanism for abstraction in both artificial and natural vision systems, potentially explaining model versatility and compactness.
  46. Quick, beautiful web UIs for ML apps

    The Machine Learning Compilation (MLC) group, led by Tianqi Chen at CMU, is developing frameworks like MLC Chat and Web LLM to enable running large language models on consumer hardware, including iPhones and web browsers. This initiative aims to mitigate the current GPU shortage by allowing models to run locally on devices with AMD cards or even just CPUs. Projects like Hugging Face's text-to-webapp generator and Gradio are also contributing to easier deployment and accessibility of ML models for developers and end-users.
  47. Fit More and Train Faster With ZeRO via DeepSpeed and FairScale

    Hugging Face has integrated ZeRO (Zero Redundancy Optimizer) into its libraries, leveraging DeepSpeed and FairScale. This enhancement allows for more efficient training of large language models by reducing memory redundancy across distributed training setups. The optimization enables fitting larger models into memory and accelerating the training process.
  48. Improving Recommendation Systems & Search in the Age of LLMs

    A new paper explores the critical role of user state representation in contextual multi-armed bandit (CMAB) recommender systems, finding that variations in state representation can yield greater performance improvements than changes to the bandit algorithm itself. The research highlights that no single embedding or aggregation strategy is universally superior, emphasizing the need for domain-specific evaluations. Another study introduces BEAR, a novel fine-tuning objective for Large Language Models (LLMs) in recommendation tasks that explicitly accounts for beam search behavior during training to address inconsistencies between training and inference. Additionally, a paper proposes a methodology to measure the stability and plasticity of recommender systems, evaluating how models adapt to retraining and changes in data patterns.

    IMPACT Advances in user state representation and LLM fine-tuning for recommendations could lead to more personalized and effective user experiences.

  49. CLIP: Connecting text and images

    OpenAI has introduced CLIP, a neural network designed to learn visual concepts from natural language supervision. This model can perform a wide range of image classification tasks without specific training for each benchmark, leveraging the vast amount of text paired with images available online. CLIP aims to overcome limitations of traditional computer vision models, such as the cost of creating datasets and the narrow focus of task-specific training, by achieving robust performance across various benchmarks with zero-shot capabilities.
  50. Open Preference Dataset for Text-to-Image Generation by the 🤗 Community

    OpenAI has detailed a new method for generating images from text using CLIP latents, employing a two-stage process with a prior and a decoder. This approach enhances image diversity while maintaining photorealism and caption similarity, and allows for language-guided image manipulations. Separately, OpenAI also introduced DALL·E, a 12-billion parameter GPT-3 variant capable of creating images from text descriptions, demonstrating abilities like combining concepts and rendering text.

    IMPACT Introduces new techniques for text-to-image generation, potentially improving diversity and controllability.