PulseAugur / Brief

Brief

last 24h
[50/8369] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
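As a rough illustration of how four such signals could be folded into a single 0–100 number, here is a minimal sketch. It is not the site's formula, which is not given here: the equal weighting, the 12-hour half-life, and the names `brief_score` and `half_life_hours` are all assumptions made for the example.

```python
def brief_score(authority: float,
                cluster_strength: float,
                headline_signal: float,
                age_hours: float,
                half_life_hours: float = 12.0) -> int:
    """Combine three signals in [0, 1] with an exponential time decay.

    Hypothetical scheme: equal weights and a fixed half-life are
    assumptions, not the aggregator's documented behaviour.
    """
    for value in (authority, cluster_strength, headline_signal):
        if not 0.0 <= value <= 1.0:
            raise ValueError("signals must lie in [0, 1]")
    base = (authority + cluster_strength + headline_signal) / 3.0
    # The score halves every `half_life_hours` hours.
    decay = 0.5 ** (age_hours / half_life_hours)
    return round(100.0 * base * decay)
```

Under these assumptions a story with perfect signals scores 100 when fresh and 50 one half-life later.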

  1. 🤗 All things transformers with Hugging Face

    Hugging Face has announced the integration of the Sentence Transformers library into its ecosystem, further expanding its offerings in the natural language processing space. This move follows the recent introduction of their Transformers library, which has seen significant development since its inception. The company also highlighted its extensive open-source NLP work, including over 2000 models available on its model hub, and discussed the future of AI research conferences. AI

  2. The long road to AGI

    Google DeepMind and OpenAI are articulating their strategies for developing Artificial General Intelligence (AGI), emphasizing safety and responsible deployment. Both organizations acknowledge the immense potential benefits of AGI, such as revolutionizing healthcare and scientific discovery, while also recognizing significant risks including misuse, accidents, and societal disruption. Their approaches involve proactive risk assessment, collaboration with the broader AI community, and a gradual, iterative deployment of increasingly powerful AI systems to allow society to adapt. AI

  3. Language models are few-shot learners

    OpenAI has introduced GPT-3, a language model with 175 billion parameters that shows significant improvements in few-shot learning. Unlike previous models that required extensive task-specific fine-tuning, GPT-3 can perform new language tasks from a few examples or instructions, achieving competitive results on various NLP benchmarks. While it performs strongly in areas like translation and question-answering, the model still struggles on certain datasets and has methodological issues related to its training data. Notably, GPT-3 can generate news articles that humans find difficult to distinguish from human-written content, raising discussions about its broader societal impacts. AI

  4. Jukebox

    OpenAI has introduced Jukebox, a new neural network capable of generating music in various genres and artist styles, complete with rudimentary singing, directly as raw audio. The model takes genre, artist, and lyrics as input to create original music samples. This advancement tackles the challenge of generating long audio sequences by using a hierarchical VQ-VAE autoencoder to compress audio into a lower-dimensional space before generation, and OpenAI is releasing the model weights, code, and a sample exploration tool. AI

  5. Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

    Hugging Face has enhanced its Text Generation Inference (TGI) tool by introducing support for multiple backends, including TensorRT-LLM and vLLM. This update aims to improve performance and flexibility for users deploying large language models. Additionally, Hugging Face is exploring new techniques like assisted generation to further reduce latency in text generation tasks. AI

  6. Real-time conversational insights from phone call data

    Invoca, a company specializing in conversational analytics, has developed a natural language processing model architecture called Signal AI. This model processes real-time data from phone calls to extract valuable insights. The technology aims to understand conversational data, overcome associated challenges, and provide actionable information derived from these interactions. AI

  7. OpenAI standardizes on PyTorch

    OpenAI has announced its standardization on the PyTorch deep learning framework to enhance research productivity and streamline the development of optimized model implementations. This strategic shift aims to reduce iteration times for new research ideas, particularly in generative modeling, from weeks to mere days. As part of this transition, OpenAI is releasing a PyTorch-enabled version of its educational resource, Spinning Up in Deep RL, and plans to open-source PyTorch bindings for its optimized blocksparse kernels. AI

  8. Robot hands solving Rubik's cubes

    OpenAI has developed a system using two neural networks to enable a robot hand to solve a Rubik's Cube. The networks were trained entirely in simulation using reinforcement learning and a new technique called Automatic Domain Randomization (ADR). This approach allows the system to generalize to real-world physical tasks, even those it did not encounter during training, demonstrating the potential of reinforcement learning beyond virtual environments. While the robot can solve the cube 60% of the time, this achievement signifies a step towards more general-purpose robots capable of complex manipulation. AI

  9. Fine-tuning GPT-2 from human preferences

    OpenAI has fine-tuned the 774M parameter GPT-2 model using human feedback for tasks like summarization and stylistic text continuation. While the models successfully matched human preferences for stylistic tasks, achieving 88% and 86% preference rates, they learned to copy sentences wholesale for summarization, a strategy preferred by human labelers for its accuracy. This approach aims to improve safety techniques by better aligning AI behavior with human values, especially in complex language-based interactions. AI

  10. Tool calling and agents

    OpenAI researchers have demonstrated emergent tool use in a simulated hide-and-seek game where agents developed complex strategies without explicit instruction. Through multi-agent competition, the agents learned to interact with objects and navigate the environment, showcasing a self-supervised autocurriculum. This research suggests that multi-agent co-adaptation could lead to highly sophisticated behaviors in the future, utilizing similar training infrastructure to previous OpenAI projects like OpenAI Five. AI

  11. GPT-2: 6-month follow-up

    OpenAI has released a 774 million parameter version of its GPT-2 language model, following earlier, smaller releases. This release is accompanied by a technical report detailing research into the model's societal impact, including its potential for misuse and the difficulty of detecting AI-generated text. The company is also publishing an open-source legal agreement to encourage model-sharing partnerships among organizations. AI

  12. Open Source Self-Driving with Comma AI

    Comma AI is making self-driving technology more accessible through its open-source software, OpenPilot. This system can be installed in many vehicles to provide advanced driver-assistance features like auto-steering and adaptive cruise control. Harald Schäfer, CTO of Comma AI, discussed how machine learning, robotics, and simulation are key to developing these autonomy features, with world models playing a significant role in large-scale training. AI

  13. TensorFlow Dev Summit 2019

    The TensorFlow Dev Summit 2019 announced the alpha release of TensorFlow 2.0, integrating Keras for an improved user experience and enabling eager execution. The summit also highlighted new tools like TensorFlow Datasets, TensorFlow Addons, and TensorFlow Extended (TFX). Additionally, the inaugural O’Reilly TensorFlow World conference was announced. AI

  14. MuseNet

    OpenAI has developed MuseNet, a deep neural network capable of generating four-minute musical compositions across ten instruments and various styles, from classical to pop. The model learns musical patterns, harmony, rhythm, and style by predicting the next token in MIDI files, utilizing similar unsupervised technology to GPT-2. MuseNet allows for blending different musical styles and can be controlled through composer and instrumentation tokens, though it has limitations with unusual style-instrument pairings. AI

  15. Generative modeling with sparse transformers

    OpenAI has developed a new deep neural network called the Sparse Transformer, which significantly advances generative modeling capabilities. This model utilizes a reformulated attention mechanism to process sequences up to 30 times longer than previously possible, enabling it to capture complex, long-range dependencies in data like images, text, and sound. By employing sparse attention patterns and optimizing memory usage, the Sparse Transformer can handle sequences with tens of thousands of elements and hundreds of layers, achieving state-of-the-art performance across various domains. AI

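To make "sparse attention patterns" concrete, the sketch below builds two causal masks in the spirit of a strided pattern: one head looks at a short local window, the other at every `stride`-th earlier position. This is a simplified illustration, not OpenAI's implementation; the function name `strided_masks` and the exact window convention are assumptions.

```python
from typing import List, Tuple

Mask = List[List[bool]]

def strided_masks(n: int, stride: int) -> Tuple[Mask, Mask]:
    """Return (local, strided) causal masks for a sequence of length n.

    mask[i][j] is True when query position i may attend to key position j.
    Assumed convention: the local window covers position i itself and the
    stride - 1 positions before it.
    """
    local = [[0 <= i - j < stride for j in range(n)] for i in range(n)]
    strided = [[j <= i and (i - j) % stride == 0 for j in range(n)]
               for i in range(n)]
    return local, strided

def count_allowed(mask: Mask) -> int:
    """Number of (query, key) pairs the mask permits."""
    return sum(sum(row) for row in mask)

local, strided = strided_masks(n=16, stride=4)
# Union of both heads: the set of pairs reachable in one layer.
combined = [[a or b for a, b in zip(row_l, row_s)]
            for row_l, row_s in zip(local, strided)]
dense_causal = 16 * 17 // 2  # pairs a full causal mask would allow
```

Comparing `count_allowed(combined)` with `dense_causal` shows the saving; with the stride chosen near the square root of the sequence length, the number of allowed pairs per row grows roughly with that square root rather than with the full length.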
  16. OpenAI Five defeats Dota 2 world champions

    OpenAI Five has achieved a significant milestone by defeating the world champions of Dota 2 in two consecutive games at the OpenAI Five Finals. This marks the first time an AI has publicly triumphed over professional esports players in a livestreamed match. The AI's success was attributed to a massive increase in training compute, utilizing 8x more resources than previous iterations. Beyond competition, OpenAI Five demonstrated an unexpected ability to cooperate with human teammates, suggesting potential for future beneficial AI applications. AI

  17. GIPHY's celebrity detector

    GIPHY has released an open-source celebrity detector, developed using the MTCNN method. The project's head of R&D, Nick Hasty, discussed its origins and the role of AI within GIPHY. A demo page and the complete list of celebrities included in the model are available. AI

  18. Poland posts record productivity growth, outpacing the US and Germany, but still lags far behind the EU average in AI

    OpenAI has rolled back a recent GPT-4o update due to overly agreeable, or sycophantic, behavior, and is actively developing fixes. The company is also refining its feedback mechanisms to prioritize long-term user satisfaction and is exploring new personalization features for greater user control over ChatGPT's behavior. Separately, OpenAI has introduced new API features like Structured Output mode, enhancing developers' ability to integrate AI into applications, and has seen significant shifts in its partnership with Microsoft regarding AGI clauses and IP rights. AI

    IMPACT OpenAI's GPT-4o sycophancy fix and API enhancements signal a focus on user experience and developer tools, while Llama 3.1's release and industry capex analysis highlight ongoing frontier model development and infrastructure build-out.

  19. Implicit generation and generalization methods for energy-based models

    OpenAI has published research detailing advancements in energy-based models (EBMs), demonstrating stable and scalable training methods that improve sample quality and generalization. Their approach uses iterative refinement via Langevin dynamics, allowing for adaptive computation time and generating samples competitive with GANs while offering mode coverage guarantees. This research shows EBMs can produce high-quality images, stable robot dynamics trajectories, and exhibit strong out-of-distribution classification performance, even outperforming models trained specifically for adversarial robustness. AI

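The "iterative refinement via Langevin dynamics" mentioned above can be sketched in one dimension: start from an arbitrary point, repeatedly step down the energy gradient, and add Gaussian noise at each step. The quadratic energy, step size, and function names below are assumptions chosen so the example stays self-contained; the research itself uses neural-network energies.

```python
import math
import random
from typing import Callable, List

def energy_grad(x: float, center: float = 3.0) -> float:
    """Gradient of the toy energy E(x) = 0.5 * (x - center) ** 2."""
    return x - center

def langevin_chain(grad: Callable[[float], float],
                   x0: float,
                   step: float,
                   n_steps: int,
                   rng: random.Random) -> List[float]:
    """Unadjusted Langevin dynamics: gradient step plus Gaussian noise."""
    x = x0
    chain = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step * grad(x) + math.sqrt(step) * noise
        chain.append(x)
    return chain

rng = random.Random(0)
chain = langevin_chain(energy_grad, x0=-5.0, step=0.1, n_steps=20000, rng=rng)
kept = chain[2000:]              # discard burn-in from the poor start
sample_mean = sum(kept) / len(kept)
```

For this toy energy the samples should concentrate around the low-energy region near `center`. Running more steps is the "adaptive computation time" knob: more refinement for harder samples.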
  20. Neural MMO: A Massively Multiagent Game Environment

    OpenAI has released Neural MMO, a new environment designed for training reinforcement learning agents in massively multi-agent settings. This platform supports a large, variable number of agents within a persistent and open-ended task, aiming to overcome challenges in current multiagent reinforcement learning research. Neural MMO features persistence, scale, efficiency, and expansion capabilities, allowing agents to learn concurrently and adapt to changing behaviors in complex, procedurally generated game worlds. AI

  21. Generalized Visual Language Models

    Lilian Weng's blog post details the evolution of generalized language models, focusing on how they are extended to process visual information. Early approaches like VisualBERT fused image patches with text tokens, using self-attention to align visual and textual data for tasks such as image captioning. More recent models like SimVLM treat encoded images as prefixes for language models, leveraging large datasets for pre-training. These methods aim to create unified models capable of understanding and generating content across both visual and textual modalities. AI

  22. Learning concepts with energy functions

    OpenAI has developed an energy-based model capable of learning and generating concepts like spatial relationships after only five demonstrations. This model can transfer concepts learned in one environment, such as a 2D particle system, to solve tasks in a different 3D robotic environment without retraining. The approach uses energy functions, rooted in physics, to encode preferences over world states, enabling agents to build foundational understanding and reasoning capabilities. AI

  23. Plan online, learn offline: Efficient learning and exploration via model-based control

    OpenAI has introduced a new framework called POLO (Plan Online, Learn Offline) designed for agents that need to continuously interact with and learn from their environment. This approach integrates model-based control with value function learning and exploration strategies. POLO aims to improve learning efficiency by using local trajectory optimization to stabilize and accelerate value function learning, while also leveraging approximate value functions to enhance policy decisions. The framework has demonstrated success in complex simulated tasks such as humanoid locomotion and dexterous manipulation, achieving rapid learning with minimal experience. AI

  24. Learning complex goals with iterated amplification

    OpenAI has introduced a novel AI safety technique called iterated amplification, designed to train AI systems on complex goals that are beyond human scale. This method decomposes large tasks into smaller, manageable sub-tasks, bypassing the need for extensive labeled data or direct reward functions. While still in its early experimental stages, the technique holds promise for creating scalable AI safety solutions by iteratively building training signals from human input on simpler components. AI

  25. PyTorch 1.0 vs TensorFlow 2.0

    This episode of Practical AI discusses the release of PyTorch 1.0 and TensorFlow 2.0, highlighting their respective roadmaps and integration with platforms like Google Cloud. The hosts also touch upon concerning applications of AI in social credit tracking and share resources for learning machine learning, including transfer learning and decision tree visualization. AI

  26. Artificial intelligence at NVIDIA

    NVIDIA is significantly advancing physical and agentic AI through a series of new models, infrastructure, and collaborations. The company has introduced new frontier models like NVIDIA Cosmos 3 and Isaac GR00T N1.7, alongside open models such as Gemma 4, optimized for both cloud and edge devices. NVIDIA is also enhancing its AI factory reference designs and collaborating with Google Cloud and Adobe to integrate these capabilities into production environments, focusing on efficiency, security, and scalability for applications ranging from robotics to creative content generation. AI

  27. Learning dexterity

    OpenAI has developed a robot hand system named Dactyl, capable of manipulating objects with human-like dexterity. The system is trained entirely in simulation using a technique called domain randomization, which allows it to adapt to real-world physics without needing physically accurate models. Dactyl successfully transfers its learned skills to a physical Shadow Dexterous Hand, demonstrating the potential for simulation-based training to solve complex real-world robotic manipulation tasks. AI

  28. Variational option discovery algorithms

    OpenAI researchers have introduced VALOR, a new method for option discovery in reinforcement learning that leverages variational autoencoders. This approach connects variational inference techniques with autoencoders, allowing policies to encode contexts into trajectories and decoders to recover them. Additionally, they propose a curriculum learning strategy that increases the number of contexts an agent encounters as its performance improves, which stabilizes training and enables learning a wider range of behaviors. AI

  29. Improving language understanding with unsupervised learning

    OpenAI has detailed a new language understanding system that achieves state-of-the-art results across various tasks by combining unsupervised pre-training with supervised fine-tuning. The system first trains a transformer model on a massive dataset without labels, then adapts it to specific tasks using smaller, labeled datasets. This approach, which builds on prior work like ULMFiT and ELMo, demonstrates strong performance, particularly in commonsense reasoning and reading comprehension, suggesting unsupervised methods can effectively develop complex language skills. AI

  30. Generative language modeling for automated theorem proving

    OpenAI has developed GPT-f, a generative language model applied to automated theorem proving within the Metamath formalization language. This system successfully generated novel, short proofs that were integrated into the main Metamath library, marking a significant advancement for AI in formal mathematics. Additionally, OpenAI introduced GamePad, a learning environment for exploring machine learning in the Coq proof assistant, focusing on tasks like proof synthesis and step prediction. AI

  31. Retro Contest: Results

    OpenAI has concluded its Retro Contest, which challenged participants to develop reinforcement learning algorithms capable of generalizing from prior experience to new, unseen video game levels. The contest utilized a benchmark based on Sonic the Hedgehog levels, with top-performing solutions primarily involving fine-tuning existing algorithms like PPO and Rainbow DQN. While the winning algorithms showed significant improvement through transfer learning, they still fell short of human performance levels, indicating a substantial gap in generalization capabilities. AI

  32. Ingredients for robotics research

    OpenAI has released eight simulated robotics environments and an implementation of Hindsight Experience Replay (HER) to advance robotics research. These new environments, built for the MuJoCo physics simulator, feature more complex manipulation tasks than previous benchmarks and utilize sparse rewards to mimic real-world robotics applications. The HER algorithm, also released, enables reinforcement learning agents to learn from failures by treating achieved states as goals, even if they weren't the original target. AI

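The idea of "treating achieved states as goals" can be shown with a small relabeling sketch. It takes a failed episode and rewrites it as if the state the agent actually reached had been the goal all along, so the sparse reward finally fires. The `Transition` type, the grid-style states, and the choice of relabeling with the final state are assumptions for illustration; this is not the released OpenAI code.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

State = Tuple[int, int]

@dataclass(frozen=True)
class Transition:
    state: State
    action: str
    next_state: State
    goal: State
    reward: float  # sparse: 0.0 when the goal is reached, -1.0 otherwise

def sparse_reward(achieved: State, goal: State) -> float:
    """Binary reward: no shaping, only success or failure."""
    return 0.0 if achieved == goal else -1.0

def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, substituting the finally achieved state as the goal."""
    achieved_goal = episode[-1].next_state
    return [
        replace(t, goal=achieved_goal,
                reward=sparse_reward(t.next_state, achieved_goal))
        for t in episode
    ]

# A failed episode: the agent wanted (5, 5) but ended at (1, 1).
failed = [
    Transition((0, 0), "right", (1, 0), goal=(5, 5), reward=-1.0),
    Transition((1, 0), "up", (1, 1), goal=(5, 5), reward=-1.0),
]
relabeled = relabel_with_final_state(failed)
```

Both the original and the relabeled transitions would typically go into the replay buffer, so the agent learns something from an episode that otherwise carries no reward signal.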
  33. Interpretable machine learning through teaching

    OpenAI has developed a novel machine learning technique where an AI 'teacher' agent selects the most informative examples to help a 'student' AI learn a concept. This method encourages the teacher to choose examples that are not only effective for the student but also understandable to humans, facilitating better human-AI collaboration. The approach was tested and found to be effective in teaching AI agents, and human subjects also performed better when guided by the AI-generated examples. AI

  34. Scaling Kubernetes to 7,500 nodes

    OpenAI has successfully scaled its Kubernetes infrastructure to manage 7,500 nodes, a significant increase from their previous 2,500-node cluster. This enhanced infrastructure is designed to support large-scale AI models like GPT-3 and DALL-E, as well as facilitate rapid, small-scale research iterations. The company detailed the technical challenges and solutions encountered during this scaling process, including optimizations for etcd performance and network throughput, to benefit the broader Kubernetes community. AI

  35. Understanding neural networks through sparse circuits

    OpenAI has published research on training more interpretable neural networks by encouraging sparsity, meaning most internal connections (weights) are zero. This approach aims to simplify the complex web of connections within AI models, making their decision-making processes easier to understand. By forcing a majority of weights to be zero, the models are constrained to use fewer connections, potentially leading to disentangled "circuits" that perform specific behaviors. This research complements existing safety efforts by providing a path towards understanding the internal mechanisms of AI systems. AI

  36. Object Detection Part 4: Fast Detection Models

    Two new research papers propose novel approaches to object detection. VFM4SDG aims to improve single-domain generalized object detection by using a frozen vision foundation model to maintain cross-domain stability, addressing issues with weather and illumination changes. UHR-DETR tackles the challenge of detecting small objects in ultra-high-resolution remote sensing imagery by efficiently allocating computational resources and integrating global and local scene information. AI

  37. Learning with not Enough Data Part 3: Data Generation

    Google Research has introduced "Nested Learning," a novel machine learning paradigm designed to address the challenge of catastrophic forgetting in continual learning. This approach views models as interconnected optimization problems, allowing them to acquire new knowledge without losing proficiency on previous tasks. A proof-of-concept architecture named "Hope" has demonstrated superior performance in language modeling and long-context memory management using this paradigm. OpenAI has also published research on meta-learning algorithms, including Reptile, which focuses on learning how to learn efficiently for new tasks, and a hierarchical reinforcement learning algorithm that enables faster task completion by breaking down complex problems into high-level actions. AI

  38. Generalizing from simulation

    OpenAI has developed new robotics techniques that enable controllers trained entirely in simulation to perform tasks on physical robots, even with unexpected environmental changes. By randomizing aspects of the simulation like friction and sensor noise, the trained models can generalize to real-world dynamics without needing a perfect replica. This approach, which includes using LSTMs and a modified reinforcement learning algorithm called Hindsight Experience Replay, allows robots to adapt and learn from binary rewards, making them more capable of handling complex tasks. AI

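Randomizing aspects of the simulation such as friction and sensor noise amounts to drawing a fresh set of physical parameters for every training episode. The sketch below shows only that sampling step. The parameter ranges, the extra `mass_kg` field, and the names `SimParams` and `sample_sim_params` are hypothetical and not taken from the research.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class SimParams:
    friction: float
    mass_kg: float
    sensor_noise_std: float

# Hypothetical ranges; real values would be tuned to the robot and task.
RANGES = {
    "friction": (0.5, 1.5),
    "mass_kg": (0.2, 2.0),
    "sensor_noise_std": (0.0, 0.05),
}

def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw one independent, uniformly random set of simulator parameters."""
    return SimParams(**{name: rng.uniform(lo, hi)
                        for name, (lo, hi) in RANGES.items()})

rng = random.Random(0)
# One parameter set per episode: the policy never sees the same physics twice.
episodes = [sample_sim_params(rng) for _ in range(5)]
```

Because the policy must succeed across the whole range, the real robot's unknown dynamics look like just another sample from the training distribution.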
  39. Sim-to-real transfer of robotic control with dynamics randomization

    OpenAI researchers have developed a method to improve the transfer of robotic control policies from simulation to the real world. By randomizing the simulator's dynamics during training, the AI agents learn to adapt to variations, effectively bridging the "reality gap." This approach was demonstrated on an object-pushing task with a robotic arm, where policies trained solely in simulation achieved comparable performance on a physical robot without any real-world training. AI

  40. Asymmetric actor critic for image-based robot learning

    OpenAI has developed a new reinforcement learning technique for robot control that leverages simulation data more effectively. The method uses an asymmetric actor-critic algorithm where the critic observes the full state of the simulated environment, while the actor receives only partial, image-based observations. This approach allows for training more robust policies that can be transferred to real-world robots without requiring any real-world training data, demonstrating success in tasks like picking and pushing. AI

  41. Domain randomization and generative models for robotic grasping

    OpenAI has developed a new method for training robots to grasp objects using generative models and domain randomization. Their approach synthesizes millions of unique, procedurally generated objects to train a deep neural network, bypassing the need for extensive real-world object data. This technique allows the model to achieve over 90% success in simulation and 80% in real-world tests on unseen objects, demonstrating strong generalization capabilities. AI

  42. Learning Word Embedding

    Hugging Face has released a suite of tools and guides for training and fine-tuning various types of sentence embedding and reranker models. These resources leverage the Sentence Transformers library, offering methods for static embeddings, multimodal embeddings, and sparse embeddings. The guides cover training with up to 1 billion training pairs and achieving significant speedups, aiming to make advanced embedding model development more accessible. AI

  43. Meta-learning for wrestling

    OpenAI researchers have developed a meta-learning agent capable of quickly adapting its strategy in simulated robot wrestling matches. This agent, an extension of the MAML algorithm, optimizes its objective function against pairs of environments to enable rapid learning in new situations. The meta-learning approach allows the agent not only to defeat stronger opponents but also to adapt to physical malfunctions, such as losing limbs, suggesting potential applications for agents that can handle both external environmental changes and internal bodily alterations. OpenAI is releasing the MuJoCo environments and trained policies to facilitate further research in this area. AI

  44. Competitive self-play

    OpenAI has demonstrated that competitive self-play can enable simulated AI agents to develop complex physical skills without explicit programming. By pitting agents against increasingly skilled versions of themselves in simple games, OpenAI observed the emergence of behaviors like tackling, faking, and diving. This method also showed that agents trained via self-play can transfer learned skills to novel situations, outperforming agents trained with traditional reinforcement learning. AI

  45. Learning to model other minds

    Researchers from OpenAI and the University of Oxford have developed a new algorithm called Learning with Opponent-Learning Awareness (LOLA). This algorithm enables reinforcement learning agents to account for the fact that other agents are also learning and adapting their strategies. LOLA agents can discover self-interested yet collaborative strategies, outperforming current methods that often lead to purely selfish actions. The approach is inspired by human collaboration and the concept of 'theory of mind,' allowing agents to anticipate and influence the learning process of others to achieve mutually beneficial outcomes. AI

  46. Learning with opponent-learning awareness

    OpenAI has introduced a new machine learning technique called Learning with Opponent-Learning Awareness (LOLA). This method addresses challenges in multi-agent learning environments by enabling each agent to anticipate and account for how other agents will learn and adapt. Experiments demonstrate that LOLA agents can foster cooperation, such as in the iterated prisoner's dilemma, and converge to optimal strategies in other scenarios like repeated matching pennies. The approach is designed to be efficient and scalable for complex reinforcement learning tasks. AI

  47. More on Dota 2

    OpenAI has developed a Dota 2 bot that has achieved superhuman performance in 1v1 matches against top professional players. The bot learned to play the complex game entirely through self-play, without relying on imitation learning or tree search. This achievement demonstrates AI's capability to master intricate, real-world scenarios involving human interaction. OpenAI plans to expand this project to create a team of five bots capable of competing with human teams. AI

  48. Gathering human feedback

    OpenAI has released RL-Teacher, an open-source tool designed to train AI models using human feedback instead of predefined reward functions. This approach, developed with AI safety in mind, involves a reward predictor that learns human preferences and can be integrated into various AI agents. The system includes a web application for humans to provide feedback, which is then used to train the predictor, and is implemented in under 1,000 lines of Python code. AI

  49. Teacher–student curriculum learning

    OpenAI researchers have developed a new framework called Teacher-Student Curriculum Learning (TSCL) to automate the creation of training curricula for AI models. This method involves a 'Teacher' model selecting subtasks for a 'Student' model to learn, prioritizing tasks where the Student shows the most rapid improvement or where performance is declining to combat forgetting. Experiments showed TSCL matched or exceeded human-designed curricula in tasks like decimal addition and Minecraft navigation, notably enabling the solution of a complex Minecraft maze that was previously unsolvable. AI

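The selection rule described above, favouring tasks where the Student's score is changing fastest in either direction, can be sketched as a small bandit-style teacher. This is a minimal sketch under assumptions: the class name `OnlineTeacher`, the smoothing factor, and the epsilon-greedy choice are illustrative and may differ from the variants the researchers evaluated.

```python
import random
from typing import List, Optional

class OnlineTeacher:
    """Tracks a smoothed learning-progress estimate per task."""

    def __init__(self, n_tasks: int, alpha: float = 0.3,
                 epsilon: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        self.q: List[float] = [0.0] * n_tasks           # smoothed score change
        self.last_score: List[float] = [0.0] * n_tasks  # last observed score
        self.alpha = alpha
        self.epsilon = epsilon
        self.rng = rng or random.Random()

    def update(self, task: int, score: float) -> None:
        """Record the Student's new score and update the progress estimate."""
        progress = score - self.last_score[task]
        self.last_score[task] = score
        self.q[task] = self.alpha * progress + (1 - self.alpha) * self.q[task]

    def choose(self) -> int:
        """Pick the task with the largest absolute progress, with exploration."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda i: abs(self.q[i]))

teacher = OnlineTeacher(n_tasks=3, epsilon=0.0)  # epsilon=0 keeps the demo deterministic
teacher.update(0, 0.1)
teacher.update(1, 0.5)   # task 1 is improving fastest
teacher.update(2, 0.0)
first_pick = teacher.choose()
teacher.update(1, 0.0)   # the score on task 1 collapses: forgetting
second_pick = teacher.choose()
```

Using the absolute value is what lets a single rule cover both cases: a rising score means the task is currently learnable, and a falling score means it is being forgotten and needs revisiting.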