PulseAugur / Brief
last 24h
[50/5166] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
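As a rough illustration of how a 0–100 score could combine the four signals named above, here is a minimal sketch. The weights, the 24-hour half-life, the multiplicative use of time decay, and every name in the code (`ClusterSignals`, `brief_score`, `WEIGHTS`, `HALF_LIFE_HOURS`) are assumptions made for this example; the page does not state how the actual score is computed.

```python
from dataclasses import dataclass


@dataclass
class ClusterSignals:
    """Hypothetical per-cluster inputs, each normalized to the range 0..1."""
    authority: float         # how authoritative the sources are
    cluster_strength: float  # how many / how consistent the clustered stories are
    headline_signal: float   # strength of the headline itself
    age_hours: float         # hours since the cluster was first seen


# Assumed weights (they sum to 1.0) and an assumed 24-hour half-life.
WEIGHTS = {"authority": 0.40, "cluster_strength": 0.35, "headline_signal": 0.25}
HALF_LIFE_HOURS = 24.0


def _clamp01(x: float) -> float:
    """Keep a signal inside 0..1 so the final score stays inside 0..100."""
    return max(0.0, min(1.0, x))


def brief_score(s: ClusterSignals) -> float:
    """Weighted sum of the three signals, scaled by exponential time decay."""
    base = (
        WEIGHTS["authority"] * _clamp01(s.authority)
        + WEIGHTS["cluster_strength"] * _clamp01(s.cluster_strength)
        + WEIGHTS["headline_signal"] * _clamp01(s.headline_signal)
    )
    # The score halves every HALF_LIFE_HOURS; negative ages are treated as 0.
    decay = 0.5 ** (max(0.0, s.age_hours) / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)
```

Under these assumptions, a cluster with all three signals at their maximum scores 100 when fresh and 50 once it is 24 hours old, which is one simple way to let older stories sink in a "last 24h" view.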

  1. Insulin resistance prediction from wearables and routine blood biomarkers

    Google Research has developed a machine learning model capable of predicting insulin resistance using data from wearable devices and routine blood tests. This novel approach aims to provide a more accessible and scalable method for early screening of type 2 diabetes risk, which is often preceded by insulin resistance. The system demonstrated strong performance in predicting insulin resistance, particularly in individuals at higher risk, and includes an AI agent built on Gemini LLMs to help users understand their risk and potential next steps. AI

  2. Highly accurate genome polishing with DeepPolisher: Enhancing the foundation of genomic research

    Google Research has introduced DeepPolisher, an open-source deep learning tool designed to significantly enhance the accuracy of genome assemblies. Developed in collaboration with UC Santa Cruz Genomics Institute, this method reduces base-level errors by 50% and insertion/deletion errors by 70%. These improvements are crucial for more accurate gene identification and disease variant detection, building upon existing genome assembly techniques. AI

  3. NIST’s ‘Living Reference Material’ Could Accelerate R&D of Lifesaving Biological Drugs

    NIST has developed two new reference materials to enhance quality control in the biopharmaceutical industry. One is a living reference material, NISTCHO, which produces monoclonal antibodies and allows manufacturers to optimize their production processes. The other is a standard reference material, SRM 1989, consisting of precisely sized particles designed to help detect and quantify impurities in protein-based drugs. These materials aim to accelerate drug development, ensure drug safety and efficacy, and reduce manufacturing costs by improving the accuracy and uniformity of quality control measures. AI

  4. Synthetic and federated: Privacy-preserving domain adaptation with LLMs for mobile applications

    Google AI researchers have developed a privacy-preserving method for adapting large language models (LLMs) for mobile applications, specifically enhancing the Gboard typing experience. This approach utilizes synthetic data generated by LLMs and federated learning with differential privacy to train models without compromising user data. The techniques have already been implemented in Gboard, improving typing predictions and error correction, and all production LLMs trained on user data now incorporate these privacy guarantees. AI

  5. Unlocking dependable responses with Gemini Enterprise Agent Platform’s Agentic RAG

    Researchers are developing advanced agent frameworks to improve AI reliability and efficiency across various domains. Google introduced an agentic RAG system that enhances enterprise query handling by iteratively searching for complete context, boosting accuracy by up to 34%. Hugging Face demonstrated a multi-agent economy simulation using a small 3B model, highlighting the trade-offs between model size and real-time performance. Other research explores methods for reliable tool use, regulatory compliance through agent-to-agent protocols, dynamic benchmarking for agent behavior, and robust self-evolution mechanisms for AI agents. AI

    IMPACT New agentic frameworks and evaluation methods promise more reliable, efficient, and compliant AI systems across enterprise, simulation, and regulatory domains.

  6. Measuring heart rate with consumer ultra-wideband radar

    Google Research has developed a new method for measuring heart rate using ultra-wideband (UWB) radar, a technology already present in many mobile devices. This approach leverages transfer learning, adapting models trained on different radar types to accurately detect the subtle chest movements associated with a heartbeat. The research demonstrates that UWB's radar capabilities, previously underutilized for sensing, can be effectively employed for vital sign monitoring, potentially enabling contactless heart rate tracking in everyday electronics. AI

  7. Android Earthquake Alerts: A global system for early warning

    Google has developed a global earthquake detection and early warning system utilizing the accelerometers in Android smartphones. This system, detailed in a Science publication, leverages data from millions of devices to identify seismic activity and issue alerts before the most damaging waves arrive. Since its rollout in 2021, the system has detected over 18,000 earthquakes and sent out 790 million alerts across nearly 100 countries, significantly expanding access to early warning systems. AI

  8. AI Video Is Eating The World — Olivia and Justine Moore, a16z

    AI video generation is rapidly advancing, moving beyond simple animated images to more sophisticated capabilities. Google's Veo 3 model now includes native audio generation, streamlining the creation process by eliminating the need for separate lip-syncing and sound effect tools. This development is enabling new types of AI video creators, particularly for content popular on platforms like TikTok and YouTube, often categorized as 'kids' or 'brainrot' content. AI

  9. Finding Nemotron

    NVIDIA has released two new open-source AI models: Nemotron, which focuses on reasoning capabilities, and Parakeet, a speech model. The Nemotron models are built upon Meta's Llama architecture, with Nemotron Ultra achieving high accuracy in reasoning tasks. NVIDIA emphasizes the value of open foundation models for enterprises seeking to deploy multi-model strategies, highlighting reasoning as a key differentiator in real-world AI applications. AI

  10. Grammarly acquires Superhuman

    Grammarly has acquired the email startup Superhuman, signaling a strategic move to enhance its AI platform. The acquisition aims to integrate Superhuman's advanced AI capabilities into Grammarly's existing offerings, potentially expanding its reach into new communication tools and workflows. AI

    IMPACT This acquisition could lead to more integrated AI-powered communication tools, enhancing productivity for users.

  11. OpenAI releases Deep Research API (o3/o4-mini)

    OpenAI has launched a Deep Research API that exposes deep-research variants of its o3 and o4-mini models. The API lets developers build multi-step research capabilities into their own applications rather than using them only through OpenAI's products. AI

  12. ByteDance Introduces Astra: A Dual-Model Architecture for Autonomous Robot Navigation

    ByteDance has introduced Astra, a novel dual-model architecture designed to enhance autonomous robot navigation in complex indoor environments. The system employs a System 1/System 2 approach, with Astra-Global handling low-frequency tasks like multimodal localization and Astra-Local managing high-frequency path planning. Astra-Global functions as a Multimodal Large Language Model, utilizing a hybrid topological-semantic graph built from offline mapping to process visual and linguistic inputs for precise positioning. AI

  13. Apple executives have held internal talks about buying Perplexity

    Apple executives have reportedly held preliminary discussions regarding the potential acquisition of AI startup Perplexity AI. These talks, involving key figures like Adrian Perica and Eddy Cue, are aimed at bolstering Apple's AI capabilities and talent pool. The discussions are in their nascent stages and may not result in a formal offer. AI

    IMPACT Potential acquisition could significantly boost Apple's AI integration and competitive standing.

  14. The Quiet Rise of Claude Code vs Codex

    Anthropic's Claude models are demonstrating strong performance in code generation tasks, potentially rivaling OpenAI's Codex. While not explicitly a product release, this development highlights the increasing capabilities of large language models in specialized areas like programming. The comparison suggests a competitive landscape where AI is rapidly advancing in assisting developers. AI

  15. How we're supporting better tropical cyclone prediction with AI

    Google DeepMind has launched Weather Lab, an interactive website showcasing its experimental AI-powered tropical cyclone prediction model. This new model, based on stochastic neural networks, can forecast a cyclone's formation, track, intensity, size, and shape up to 15 days in advance, generating 50 potential scenarios. Internal testing indicates its predictions are as accurate as, and often surpass, current physics-based methods, and it is being used in partnership with the U.S. National Hurricane Center to aid their forecasts and warnings. AI

  16. Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm

    NVIDIA has released Isaac GR00T N1.5, a new foundation model for robotics. The post covers post-training the model for the LeRobot SO-101 arm, with the aim of improving its performance and adaptability on that hardware. AI

  17. Apple exposes Foundation Models API and... no new Siri

    Apple has introduced a new API for Foundation Models, allowing developers to integrate large language models into their applications. This move signals Apple's entry into the generative AI space, offering tools for on-device and cloud-based AI processing. However, the company did not unveil a significantly updated version of its Siri voice assistant, disappointing some observers. AI

  18. ScreenSuite - The most comprehensive evaluation suite for GUI Agents!

    Hugging Face has released ScreenSuite, a new evaluation suite designed to comprehensively assess the performance of GUI agents. This suite aims to provide a standardized method for testing how well these agents can understand and interact with graphical user interfaces. The goal is to drive progress in the development of more capable and reliable AI agents that can operate within visual environments. AI

  19. Real-Time AI Sound Generation on Arm: A Personal Tool for Creative Freedom

    Hugging Face has released a new tool that enables real-time AI sound generation directly on Arm-based devices. This innovation allows for greater creative freedom by bringing advanced audio synthesis capabilities to personal hardware. The tool aims to democratize AI sound creation, making it more accessible for artists and developers. AI

  20. Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H

    Researchers have introduced A11y-Compressor, a framework designed to make GUI agent observations more efficient by transforming linearized accessibility trees into structured representations. This method reduces input tokens significantly while improving task success rates. Concurrently, a new benchmark called WindowsWorld has been developed to evaluate GUI agents on complex, multi-application professional workflows, revealing current agents' poor performance in such scenarios. Additionally, VLAA-GUI offers a modular framework to address challenges like early stopping and repetitive loops in autonomous GUI agents, incorporating components for verification, loop breaking, and online search. AI

    IMPACT New benchmarks and frameworks are emerging to push the capabilities of GUI agents in complex, real-world scenarios.

  21. Federated learning in production (part 2)

    This podcast episode delves into the practical aspects of implementing federated learning systems in enterprise environments, focusing on the Flower framework. The discussion highlights the architecture of Flower, including its supernodes and superlinks, and how it is designed for real-world application across data silos. The conversation also touches upon the impact of the generative AI boom on the future development roadmap of federated learning technologies. AI

  22. CodeAgents + Structure: A Better Way to Execute Actions

    Hugging Face has introduced CodeAgents, a new framework designed to enhance the execution of actions within large language models. This system integrates structured outputs with code generation capabilities, aiming to improve reliability and control. The goal is to enable LLMs to more effectively interact with external tools and environments by providing a more organized approach to action planning and execution. AI

  23. OpenAI to buy AI startup from Jony Ive

    OpenAI is reportedly acquiring Jony Ive's AI startup, io, for approximately $6.5 billion in an all-stock transaction. This move marks OpenAI's significant entry into hardware development, aiming to create new AI-powered devices. The acquisition also brings Ive, known for his work on iconic Apple products like the iPhone, and his team of designers into OpenAI. AI

    IMPACT Signals a major AI lab's strategic push into consumer hardware, potentially reshaping the landscape of AI-powered devices.

  24. Together AI acquires Refuel.ai to unlock data for developers and businesses building production-grade AI applications

    Together AI has acquired Refuel.ai, a company specializing in data cleaning and structuring for AI applications. This acquisition aims to integrate Refuel.ai's models and platform into Together AI's existing infrastructure, enhancing its AI Acceleration Cloud. The combined capabilities are intended to help enterprises overcome data challenges and accelerate the development of production-grade AI applications. AI

    IMPACT Enhances enterprise AI development by addressing data quality and structuring challenges, potentially accelerating production deployments.

  25. LeRobot v0.5.0: Scaling Every Dimension

    Hugging Face has released LeRobot v0.5.0, a significant update to its robotics simulation and data platform. This release focuses on scaling across multiple dimensions, including model size, dataset volume, and simulation complexity. Alongside the platform update, Hugging Face also launched LeRobotDataset v3.0, which introduces large-scale datasets designed to accelerate robotics research and development. AI

  26. Seeing beyond the scan in neuroimaging

    This podcast episode discusses the application of AI and machine learning in neuroimaging, specifically for diagnosing conditions like epilepsy. Dr. Gavin Winston explains how AI can analyze MRI data to detect subtle abnormalities that might be missed by human observation. The conversation also touches upon the challenges and ethical considerations of integrating AI into medical practices and its potential to revolutionize diagnostic workflows. AI

  27. Supercharge your OCR Pipelines with Open Models

    Hugging Face has released new resources to enhance Optical Character Recognition (OCR) pipelines using open models. One blog post details how to integrate these models for improved OCR performance, while another focuses on the specific finetuning process of the olmOCR model to function as a dedicated OCR engine. These guides aim to empower developers with more efficient and adaptable OCR solutions. AI

  28. Grok 3 & 3-mini now API Available

    xAI's Grok 3 and Grok 3-mini models are now available via API, allowing developers to integrate them into their applications. The availability was reported in Smol AI's newsletter. AI

  29. Zhipu.AI’s Open-Source Power Play: Blazing-Fast GLM Models & Global Expansion Ahead of Potential IPO

    Chinese AI company Zhipu.AI has open-sourced its latest GLM-4 and GLM-Z1 models, including a specialized "Rumination" model capable of autonomous web searching and self-verification. The GLM-Z1 inference model boasts up to eight times faster speeds than DeepSeek-R1, achieving 200 tokens per second on consumer GPUs. This release, along with the launch of an international platform Z.ai, signals Zhipu.AI's global ambitions and potential readiness for an IPO. AI

  30. SOTA Video Gen: Veo 2 and Kling 2 are GA for developers

    Kling 2 and Veo 2, two state-of-the-art video generation models, have been released for developers. These models represent advancements in AI-powered video creation, offering new capabilities for content generation. Their availability to developers signifies a step towards broader adoption and integration of sophisticated video synthesis tools. AI

  31. Understanding Aggregate Trends for Apple Intelligence Using Differential Privacy

    Apple is advancing research in privacy-preserving machine learning and AI, hosting a workshop to discuss techniques like federated learning and differential privacy. The company is applying these methods to its upcoming Apple Intelligence features, such as Genmoji, Image Playground, and writing tools, to understand usage trends without compromising user data. Apple is also exploring the creation of synthetic data that mimics real user content to improve these features while maintaining strict privacy standards. AI

    IMPACT Apple's focus on privacy-preserving AI techniques for Apple Intelligence features may set new standards for user data protection in generative AI.

  32. Software and hardware acceleration with Groq

    The Practical AI podcast featured an episode with Dhananjay Singh from Groq, discussing advancements in AI inference and acceleration. Groq has developed a unique hardware and software platform, including their LPU (Language Processing Unit), designed to deliver significantly faster AI response times compared to traditional GPU-based solutions. Singh highlighted Groq's approach of developing the software compiler before the hardware, a departure from conventional development methods, to achieve breakthrough performance in low latency and high throughput for AI tasks. AI

  33. AI-assisted coding with GitHub's COO

    A new paper explores the limitations of automated evaluation for AI code review bots, finding that current automated methods like G-Eval and LLM-as-a-Judge show only moderate alignment with human developer labels. The study analyzed 2,604 bot-generated comments from Beko, revealing that developer actions on these comments are influenced by contextual and organizational factors, making them unreliable ground truth. This suggests that fully automating the evaluation of AI code review comments in industrial settings remains a significant challenge. AI

    IMPACT Highlights challenges in reliably evaluating AI code review tools, impacting their adoption and effectiveness in development workflows.

  34. Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

    OpenAI has significantly updated its voice AI capabilities, introducing "Promptable Prosody" which allows for more nuanced control over speech generation. This update also includes state-of-the-art automatic speech recognition (ASR) and semantic voice activity detection (VAD). These advancements aim to make AI-generated speech more natural and expressive. AI

  35. Open R1: How to use OlympicCoder locally for coding

    Hugging Face has released OlympicCoder, a new large language model specifically designed for code generation. This model is available for local use, allowing developers to integrate it into their workflows without relying on cloud services. The release emphasizes accessibility and ease of deployment for coding-related AI tasks. AI

  36. Optimizing for efficiency with IBM’s Granite

    IBM's Granite family of large language models is being developed with a focus on efficiency, particularly for edge computing applications. The strategy involves breaking down complex tasks into smaller, manageable components and co-designing models with hardware to optimize performance. This approach prioritizes efficiency gains over solely chasing benchmark scores, aiming to provide practical AI solutions for customers. AI

  37. Open Operator, Serverless Browsers and the Future of Computer-Using Agents

    Browserbase has developed Stagehand, an open-source framework that enhances Playwright with AI-specific functionalities like action execution, structured data extraction, and page observation. This infrastructure aims to support the growing need for AI agents to interact with web content, moving beyond simple web scraping to complex task automation. The company also addresses challenges like bot detection and CAPTCHAs with a "proxy super network" and envisions future "agent authentication" for seamless AI interaction with online services. AI

  38. 1,000 Scientist AI Jam Session

    OpenAI, in collaboration with the U.S. Department of Energy, organized a "1,000 Scientist AI Jam Session" across nine national laboratories. This event brought together over 1,000 scientists to utilize advanced AI models, including OpenAI's o3-mini, for accelerating scientific discovery. The initiative aims to improve AI systems by incorporating feedback from scientists and strengthen U.S. leadership in AI and scientific innovation. AI

  39. Deep research System Card

    OpenAI has released a system card detailing the safety measures implemented for its new "Deep research" capability. This agentic feature, powered by an early version of the o3 model, is designed to conduct multi-step internet research, analyze various data formats, and execute Python code. Prior to its release to Pro users, OpenAI conducted extensive safety testing, including external red teaming and risk evaluations, to mitigate potential issues like prompt injections, disallowed content, privacy concerns, and bias. AI

  40. SmolVLM2: Bringing Video Understanding to Every Device

    Hugging Face has released SmolVLM2, a new multimodal model designed for efficient video understanding on consumer hardware. This model achieves strong performance on video question answering tasks while maintaining a small footprint, making it accessible for broader applications. SmolVLM2 is notable for its ability to process video inputs effectively without requiring specialized, high-end computing resources. AI

  41. SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

    Hugging Face has released SmolVLA, an efficient Vision-Language-Action (VLA) model for robot control trained on community data from LeRobot. Related coverage presents pi0 and pi0-FAST, VLA models from Physical Intelligence, as suitable for general robot control tasks. These releases aim to advance the capabilities of robots in understanding and acting upon visual and language instructions. AI

  42. Open-source DeepResearch – Freeing our search agents

    Hugging Face has launched Open-source DeepResearch, a new initiative aimed at democratizing access to advanced AI research. This project focuses on releasing powerful search agents that can be freely used and modified by the community. The goal is to accelerate innovation by providing researchers and developers with cutting-edge tools without proprietary restrictions. AI

  43. Samsung announces it will stop selling all home appliance products in the Chinese market

    Samsung Electronics has announced it will cease sales of all home appliance products, including televisions and monitors, in the Chinese market. This decision comes in response to a rapidly changing market environment. The company has assured customers that it will continue to provide after-sales service and uphold consumer rights according to relevant laws and regulations. AI

  44. Video generation with realistic motion

    Genmo, a company focused on improving the realism of AI-generated video, has released open models that address the common issue of simplistic motion. Many current video generation tools struggle with realistic physics, particularly in character movement like walking, often resorting to basic camera pans. Genmo's work aims to provide more dynamic and lifelike motion in generated videos. AI

  45. Operator System Card

    OpenAI has released a system card detailing the safety measures for its new Computer-Using Agent (CUA) model, named Operator. This model combines GPT-4o's vision capabilities with reinforcement learning to interpret screenshots and interact with graphical user interfaces, enabling it to perform tasks like online shopping or booking reservations under user supervision. OpenAI has implemented a multi-layered safety approach, including external red teaming and risk evaluations, to address potential issues such as prompt injection attacks and model mistakes before its research preview release. AI

  46. SmolVLM Grows Smaller – Introducing the 256M & 500M Models!

    Hugging Face has released two new, smaller versions of its SmolVLM model: a 256 million parameter version and a 500 million parameter version. These models are designed to be highly efficient and capable of running on less powerful hardware, including mobile devices. The release aims to make vision-language model technology more accessible and deployable in a wider range of applications. AI

  47. AI Engineering for Art — with comfyanonymous, of ComfyUI

    ComfyUI has emerged as a powerful, node-based interface for generative image and video workflows, contrasting with simpler prompt-driven tools. Developed by comfyanonymous, it allows users to construct complex pipelines by chaining individual operations, offering greater control and efficiency for advanced AI art engineering. The platform's flexibility has led to its rapid adoption in the open-source community, with over 60,000 GitHub stars, and the recent launch of the Comfy Registry to share custom nodes. AI

  48. 2024 in Agents [LS Live! @ NeurIPS 2024]

    The Latent Space podcast discussed the state of LLM agents in 2024, highlighting significant progress and future predictions. Professor Graham Neubig identified eight key challenges in agent development, including interfaces, LLM selection, planning, and evaluation. The discussion covered advancements in coding agents like OpenHands (formerly OpenDevin), which leads the SWE-Bench Full leaderboard, and other notable agent applications in IDEs and customer support, with companies like Cognition AI and Perplexity seeing substantial growth. AI

  49. Genesis: Generative Physics Engine for Robotics (o1-mini version)

    Genesis, a generative physics engine designed for robotics, was covered in Smol AI's newsletter. The engine aims to simulate physical interactions and environments, enabling more realistic training and operation of robots. The 'o1-mini version' in the headline most likely labels the newsletter edition rather than a variant of the engine. AI

  50. Full-duplex, real-time dialogue with Kyutai

    Kyutai, an open science research lab, has released a real-time speech-to-speech AI assistant. This release predates similar functionality teased by OpenAI. The lab's recent Moshi models and future developments were discussed in a podcast episode, alongside the broader AI ecosystem in France and the use of smaller AI models. AI
