Brief

last 24h

[50/166] 186 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
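
The scoring described above could be sketched roughly as follows. This is an illustration only, not the feed's actual formula: the `Signals` fields, the three weights, the 24-hour half-life, and the choice to apply time decay as a multiplier are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-item signals; the first three are normalized to [0, 1]."""
    authority: float         # how trusted the source is
    cluster_strength: float  # how many independent sources carry the story
    headline_signal: float   # how informative the headline is
    age_hours: float         # time since first publication


def brief_score(s: Signals,
                weights=(0.4, 0.35, 0.25),   # assumed weights, chosen to sum to 1
                half_life_hours: float = 24.0) -> float:
    """Return a 0-100 score: weighted mean of the signals, halved every half-life."""
    w_auth, w_cluster, w_head = weights
    base = (w_auth * s.authority
            + w_cluster * s.cluster_strength
            + w_head * s.headline_signal)
    # Assumed exponential decay: the score halves every `half_life_hours`.
    decay = 0.5 ** (s.age_hours / half_life_hours)
    return round(100.0 * max(0.0, min(1.0, base)) * decay, 1)
```

Under these assumptions, an item with perfect signals scores 100 at publication and 50 one half-life later, so a fresh, well-sourced story outranks an older one with the same signals.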

TOOL · arXiv cs.AI · 1d

Beyond Routing: Characterising Expert Tuning and Representation in Vision Mixture-of-Experts

Researchers have developed new methods to understand the internal workings of Mixture-of-Experts (MoE) models in computer vision. By analyzing how different visual categories are routed to specific experts and examining the tuning of these experts to various inputs, they found that an animate-inanimate distinction is a dominant factor in expert partitioning. The study reveals that experts tune to broader, continuous visual and semantic dimensions beyond simple category boundaries, highlighting the benefits of moving beyond basic routing analyses for a deeper understanding of MoE specialization. AI

IMPACT Provides novel methods for interpreting the specialized functions within complex vision models, advancing AI research.
TOOL · arXiv cs.CL · 1d

Self-Training Doesn't Flatten Language -- It Restructures It: Surface Markers Amplify While Deep Syntax Dies

A new research paper proposes the Structural Depth Hypothesis (SDH) to explain how self-training restructures language models. The study found that while surface-level linguistic features like discourse markers increase, deeper syntactic structures such as questions and passives decline. This effect was observed across multiple models and architectures, suggesting it's a specific outcome of self-training rather than a general language model behavior. AI

IMPACT This research suggests that self-training may lead to LLMs that are superficially complex but lack deep syntactic understanding, impacting data curation and text detection.
RESEARCH · arXiv cs.CL · 2d · [2 sources]

CEPO: RLVR Self-Distillation using Contrastive Evidence Policy Optimization

Researchers have developed two novel self-distillation techniques for language models to improve performance on complex reasoning tasks. AVSD (Adaptive-View Self-Distillation) balances consensus and view-specific signals from multiple teacher models to provide more reliable supervision. CEPO (Contrastive Evidence Policy Optimization) sharpens the reward signal by distinguishing decisive reasoning steps from filler tokens, using contrastive learning against incorrect answers. Both methods show significant improvements on mathematical and code-generation benchmarks, outperforming existing self-distillation baselines. AI

IMPACT These new self-distillation techniques offer improved methods for training LLMs, potentially leading to more capable models for complex reasoning tasks.
TOOL · arXiv cs.CL · 1d

Reinforcing Human Behavior Simulation via Verbal Feedback

Researchers have developed DITTO, a new model that learns to simulate human behavior by incorporating verbal feedback as a primary signal in reinforcement learning. This approach, detailed in a new paper, treats subjective and multi-faceted guidance as a first-class input, optimizing for improved rollouts based on this feedback. DITTO demonstrated a 36% improvement over its base model and outperformed GPT-5.4 on six benchmarks within the newly introduced SOUL suite, which comprises ten tasks across various human-like behavior simulations. AI

IMPACT This research introduces a novel method for training LLMs to better simulate human behavior, potentially improving their utility in roles requiring nuanced social understanding.
- GPT-5.4
- SOUL
- DITTO
TOOL · arXiv cs.CL · 1d

Training Language Agents to Learn from Experience

Researchers have developed a framework called In-context Training (ICT) that trains language agents to improve on future tasks by learning from past experience. The approach trains a 'reflector' model to generate system prompts that guide an 'actor' model, enabling cross-task self-improvement without human examples. In experiments on ALFWorld and MiniHack, agents trained with ICT outperformed baselines and generalized to new environments, suggesting that the ability to learn from experience can itself be learned.

IMPACT Enables language agents to generalize learning across tasks, potentially accelerating development of more adaptable AI systems.
TOOL · arXiv cs.CL · 2d

When Reasoning Supervision Hurts: TTCW-Based Long-Form Literary Review Generation

Researchers have developed a new dataset containing over 260,000 long-form stories, each annotated with creativity scores and review comments based on the Torrance Test of Creative Writing (TTCW). They fine-tuned Qwen3 models on this data to generate literary reviews, finding that models trained without explicit reasoning supervision performed better. The study suggests that for structured, rubric-based review generation, reasoning supervision may not be beneficial and can even lead to irrelevant or repetitive outputs. AI

IMPACT Introduces a novel dataset and methodology for AI-driven literary review generation, potentially improving automated evaluation of creative writing.
- Qwen3
- Torrance Test of Creative Writing (TTCW)
FRONTIER RELEASE · Mastodon — fosstodon.org · 1d · [4 sources]

Gemini 3.5 and Antigravity 2.0 headline Google I/O 2026 reveal: Google I/O 2026 unveils Gemini 3.5, Antigravity 2.0, and WebMCP, a proposed open web standard re…

Google announced Gemini 3.5 and Antigravity 2.0 at Google I/O 2026, alongside the proposed WebMCP standard. WebMCP aims to recast the web's developer stack around AI agents that can ship code. Chrome 149 is now offering an origin trial for WebMCP, allowing websites to expose structured tools directly to AI agents. AI

IMPACT New AI models and a proposed web standard could reshape web development and AI agent capabilities.
RESEARCH · arXiv cs.AI · 3d · [2 sources]

Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation

Two new research papers explore methods to improve multimodal large language models (MLLMs) by addressing challenges in data curation and fine-grained visual understanding. One paper proposes a framework that trains MLLMs using only pairwise modalities, reducing the need for extensive human-curated datasets. The other paper introduces Vision-OPD, a self-distillation technique that helps MLLMs better focus on crucial details within images, improving their performance on fine-grained visual tasks. AI

IMPACT These papers introduce novel techniques to enhance multimodal LLM capabilities, potentially leading to more efficient training and improved performance in fine-grained visual understanding tasks.
FRONTIER RELEASE · dev.to — LLM tag · 5d · [4 sources]

DeepSeek V4 Complete Guide — 1.6T MoE with 1M Context at 73% Lower Cost

DeepSeek V4, an open-weight model family, has been released with a 1.6-trillion-parameter Mixture-of-Experts architecture that activates only 49 billion parameters per token. This new model boasts a 1-million-token context window and significantly reduced inference costs, achieving up to 73% lower costs than its predecessor due to innovations like Hybrid Attention. The V4 family, available on Hugging Face, offers comparable quality to leading models like GPT-5.4 and Claude Opus 4.6 at a fraction of the price, with optimized hardware performance for NVIDIA Blackwell. AI

IMPACT Sets a new standard for efficiency in large MoE models, making advanced AI capabilities more accessible and affordable for developers.
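
The sparsity figures in the summary above reduce to simple arithmetic. The sketch below uses only the numbers quoted there (1.6 trillion total parameters, 49 billion active per token, up to 73% lower cost); the function names and the baseline of 1.00 cost unit are hypothetical.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters used for any single token."""
    return active_params / total_params


def discounted_cost(previous_cost: float, reduction: float) -> float:
    """Cost after a fractional reduction (0.73 means 73% lower)."""
    return previous_cost * (1.0 - reduction)


TOTAL_PARAMS = 1.6e12   # 1.6 trillion, from the summary
ACTIVE_PARAMS = 49e9    # 49 billion per token, from the summary

share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"active share per token: {share:.2%}")
# Hypothetical baseline of 1.00 cost unit for the predecessor.
print(f"relative cost at the 73% upper bound: {discounted_cost(1.00, 0.73):.2f}")
```

Roughly 3% of the parameters are active for each token. That share bounds per-token compute, while the full 1.6 trillion parameters still have to be stored and served.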
RESEARCH · arXiv cs.CL · 2d · [2 sources]

LambdaPO: A Lambda Style Policy Optimization for Reasoning Language Models

Researchers have introduced LamPO (Lambda Style Policy Optimization) and LambdaPO, novel methods for enhancing reasoning in language models. These approaches move beyond traditional group-relative objectives by using pairwise decomposed advantages, which better capture subtle differences in response quality. Experiments on various benchmarks with models like Qwen3 and Phi-4-mini show improved performance and training stability compared to existing methods. AI

IMPACT Introduces new techniques for more stable and efficient training of reasoning language models.
RESEARCH · arXiv cs.AI · 3d · [2 sources]

ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models

Researchers have developed new benchmarks and training frameworks to improve the spatial reasoning capabilities of Vision-Language Models (VLMs). One, ArchSIBench, is a comprehensive benchmark of architectural spatial intelligence that reveals significant gaps between current VLMs and human performance, particularly that of trained architects. Another, SAGE, is a self-evolving framework that enforces geometric logic consistency across transformed inputs to strengthen spatial reasoning, and it demonstrates improvements on existing benchmarks.

IMPACT Advances in spatial reasoning for VLMs could enhance their utility in robotics, 3D scene understanding, and navigation tasks.
RESEARCH · arXiv cs.AI · 3d · [2 sources]

Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap

Two new research papers delve into the intricacies of tabular foundation models (TFMs), exploring their performance and ensemble strategies. The first paper provides a mechanistic study, analyzing how different TFM architectures converge in accuracy and identifying their specific inductive biases and failure modes. The second paper investigates ensembling techniques for TFMs, revealing a diversity ceiling and a calibration trap where combining models can yield diminishing returns and even degrade performance. AI

IMPACT These studies offer deeper insights into the internal workings and practical application of tabular foundation models, potentially guiding future development and deployment strategies.
RESEARCH · arXiv cs.AI · 3d · [5 sources]

When Skills Don't Help: A Negative Result on Procedural Knowledge for Tool-Grounded Agents in Offensive Cybersecurity

Recent research indicates that while AI 'Skills' can improve agent performance in cybersecurity, their benefit diminishes significantly in offensive scenarios and may even degrade performance. This is attributed to 'environment-feedback bandwidth': when the environment supplies rich, low-latency observations, the agent has less need for pre-programmed procedural knowledge. Meanwhile, frontier AI models like Anthropic's Claude Mythos and OpenAI's GPT-5.5-Cyber are demonstrating advanced capabilities in discovering zero-day vulnerabilities and synthesizing exploits, reshaping both offensive and defensive cybersecurity strategies.

IMPACT Frontier AI models are rapidly advancing offensive and defensive cybersecurity capabilities, while research highlights limitations of current agent skill frameworks in complex threat environments.
TOOL · Hugging Face Daily Papers · 1d

Modular Multimodal Classification Without Fine-Tuning: A Simple Compositional Approach

Researchers have developed CoMET, a novel method for multimodal classification that leverages frozen pre-trained backbones and Tabular Foundation Models (TFMs). This approach uses Principal Component Analysis (PCA) to compress modality embeddings before feeding them into a TFM, eliminating the need for fine-tuning. For improved representation quality, especially when CLS tokens are misaligned, they propose PALPooling, an adaptive token pooler. CoMET achieves state-of-the-art results on various multimodal benchmarks and can handle large-scale datasets with over 500,000 samples and 2,000 classes without any training. AI

IMPACT This method challenges traditional fine-tuning approaches, potentially enabling faster and more scalable multimodal classification across various domains.
SIGNIFICANT · Google DeepMind Română(RO) · 3d · [2 sources]

Introducing Gemini Omni

Google DeepMind has unveiled Gemini Omni, a new multimodal AI model capable of understanding and processing information across text, audio, and video inputs simultaneously. This advanced model is designed to handle complex, real-world scenarios by integrating various data streams for more comprehensive comprehension. Gemini Omni aims to enhance user interaction and unlock new applications by enabling more natural and intuitive AI assistance. AI

IMPACT Enhances AI's ability to process complex, real-world scenarios by integrating multiple data streams.
- Google DeepMind
- Gemini Omni
SIGNIFICANT · Google DeepMind · 3d · [12 sources]

Simulate real-world places with Project Genie and Street View

Google DeepMind has integrated its Project Genie world model with Google Maps Street View, allowing users to generate interactive simulations of real-world locations. This new capability, announced at Google I/O, enables users to reimagine places with creative prompts, such as transforming Chicago into a desert landscape. The feature is rolling out to Google AI Ultra subscribers, initially in the U.S., with plans for global expansion. While still experimental and not yet physics-aware, the integration aims to enhance applications in robotics training, gaming, and educational experiences. AI

IMPACT Enhances AI-driven simulation capabilities for robotics, gaming, and personalized experiences by grounding generative models in real-world data.
TOOL · X — MiniMax AI · 23h

600+ new voices powered by MiniMax Speech 2.8 Turbo are now on Together AI @togethercompute 🎙️✨

MiniMax AI has released over 600 new voices through its Speech 2.8 Turbo model. These voices are now accessible on the Together AI platform. This expansion aims to provide a wider range of synthetic speech options. AI

IMPACT Expands the availability of synthetic voice options for developers and users on the Together AI platform.
COMMENTARY · r/Anthropic · 12h

Word on the street

A Reddit post discusses rumors about Anthropic's upcoming Claude 4.5 model, suggesting it might feature a multimodal architecture and improved reasoning capabilities. The post also touches on potential advancements in context window length and overall performance, hinting at a significant upgrade over previous versions. AI

IMPACT Rumors suggest potential advancements in multimodal AI and context window size, which could influence future AI development and application.
- Anthropic
- Claude 4.5
COMMENTARY · r/cursor · 16h

What models for asking, planning, and building modes do you use right now?

A Reddit user is soliciting opinions on which AI models are best suited for different tasks within the Cursor IDE. They specifically mention using Anthropic's Sonnet 4.6 and Claude Opus 4.7, alongside OpenAI's GPT 5.5, for asking questions, planning code, and building features. The user is also inquiring if a new model, Composer 2.5, could replace any of their current choices. AI

IMPACT Provides insight into current user preferences and potential emerging models for AI-assisted software development.
- Cursor
- Anthropic
- OpenAI
- GPT 5.5
- Sonnet 4.6
- Opus 4.7
- Composer 2.5
COMMENTARY · X — SemiAnalysis · 21h · [2 sources]

The full chat with Mishek Musa on how ADI is shrinking inference down to the edge and setting up physical leaderboards for the robotics community.

SemiAnalysis is focusing on the release of advanced AI models, referred to as "god models." The organization is also exploring the practical applications of AI, such as shrinking inference capabilities to the edge for robotics, as discussed in a chat with Mishek Musa. AI

IMPACT Focuses on the development and application of advanced AI models and edge robotics.
- Mishek Musa
- SemiAnalysis
COMMENTARY · 36氪 (36Kr) 中文(ZH) · 18h · [5 sources]

Institution: AliExpress Ranks First Among Cross-border E-commerce Platforms in the Korean Market

Nvidia reported a strong first quarter with a net profit of $58.3 billion, a 211% increase year-over-year. Google CEO Sundar Pichai announced that Gemini AI has reached 900 million monthly active users, with daily requests increasing sevenfold. In other news, AliExpress has become the top cross-border e-commerce platform in South Korea, and Japan is considering a supplementary budget of approximately 3 trillion yen. AI

IMPACT Nvidia's strong earnings and Gemini's user growth highlight the continued rapid expansion and monetization of AI technologies.
- Google
- AliExpress
- Sundar Pichai
- Gemini
- South Korea
- Nvidia
- Japan
- 36氪
RESEARCH · Mastodon — fosstodon.org · 1d · [2 sources]

African researchers push multilingual AI to improve health access and local innovation: A University of Pretoria lecture highlighted progress on African-language…

African researchers are developing AI models to support over 40 languages across the continent, aiming to improve access to essential services like healthcare. This initiative includes advancements in speech recognition and the creation of a pan-African large language model. The goal is to bridge language barriers and enhance digital health access, patient communication, and public service delivery for underserved communities. AI

IMPACT Multilingual AI models can significantly improve access to healthcare and public services across Africa by overcoming language barriers.
FRONTIER RELEASE · Mastodon — mastodon.social Polski(PL) · 1d · [4 sources]

🤖 [TechCrunch] Google just declared itself an AI design contender at IO 2026 🔗 More: https://techcrunch.com/2026/05/19/ai-desi

Google has announced Gemini Omni, a new AI model, and integrated new AI features into Google Workspace. The company also signaled a strong focus on AI design tools, positioning itself as a major contender at its I/O 2026 conference.

IMPACT New AI model and Workspace integrations signal Google's continued push to embed generative AI across its product suite.
RESEARCH · arXiv cs.CL · 3d · [2 sources]

Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models

Two new research papers explore methods to maintain the integrity of reasoning processes in large language models. The first paper, 'Reasoning-Trace Collapse,' identifies how fine-tuning on standard instruction-response data can degrade explicit reasoning traces, even when final answers remain correct. It proposes a structural evaluation framework to assess reasoning reliability and suggests loss-masking strategies to mitigate this collapse. The second paper, 'Stop When Reasoning Converges,' introduces PUMA, a framework that detects semantic redundancy in reasoning steps to enable early exiting. This method aims to reduce token usage and latency by stopping the reasoning process once it has stabilized, while preserving answer accuracy and the coherence of the retained reasoning chain. AI

IMPACT These papers highlight critical issues in LLM reasoning integrity and efficiency, suggesting new evaluation metrics and inference techniques that could lead to more reliable and performant models.
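
The early-exit idea in the summary above (stop once reasoning steps become redundant) can be illustrated with a toy check. This is not PUMA's algorithm: token-set overlap stands in for whatever semantic-redundancy measure the paper uses, and `threshold` and `patience` are hypothetical parameters.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two reasoning steps, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def early_exit_index(steps: list[str],
                     threshold: float = 0.8,
                     patience: int = 2) -> int:
    """Return how many steps to keep before reasoning is judged converged.

    Stops once `patience` consecutive steps each overlap the previous step
    by at least `threshold`; otherwise every step is kept.
    """
    redundant_run = 0
    for i in range(1, len(steps)):
        if jaccard(steps[i - 1], steps[i]) >= threshold:
            redundant_run += 1
            if redundant_run >= patience:
                return i + 1 - patience  # drop the redundant tail
        else:
            redundant_run = 0
    return len(steps)
```

In this sketch the retained chain ends at the first occurrence of the repeated step, so the kept steps stay coherent while the repeated tail is never generated or billed.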
RESEARCH · dev.to — LLM tag · 3d · [6 sources]

Designing Nvidia-Grade Ising Quantum AI Models for Robust Qubit Calibration

Nvidia has released open-source Ising quantum AI models designed to automate and improve the calibration of quantum processors. These models, which include a vision-language model for proposing calibration actions and CNNs for error correction decoding, are intended to be integrated into existing quantum control stacks. By treating calibration as an AI inference problem, similar to how LLMs are deployed, Nvidia aims to enhance the speed, accuracy, and robustness of quantum hardware operations, while also emphasizing the need for governance and security protocols. AI

IMPACT Enables more robust and automated calibration for quantum hardware, potentially accelerating quantum computing development.
- Nvidia
- LLM
- Cadence
- GPU
- AI Act
- Ising
- Quantum AI
- Qibo
- Qibolab
- Ubuntu Inference Snaps
- CUDA-Q
- Qibocal
- ChipStack AI Super Agent
RESEARCH · Wired — AI · 4d · [2 sources]

I Gave My OpenClaw Agent a Physical Body

An AI agent named OpenClaw was successfully integrated with a physical robot arm, enabling it to configure the arm, grasp objects, and even train another AI model for specific tasks. This development, utilizing an open-source robot arm and AI coding assistance, suggests a potential breakthrough in robotics by simplifying the control and training processes. Researchers are developing benchmarks like CaP-X to evaluate AI models' robotic capabilities, with Gemini showing promising results in multimodal understanding for physical world interactions. AI

IMPACT Demonstrates AI's growing capability in physical robotics, potentially simplifying complex control and training tasks for broader adoption.
- Google DeepMind
- Nvidia
- Jensen Huang
- ChatGPT
- Claude
- Gemini
- OpenClaw
- Codex
- Stanford
- UC Berkeley
- Carnegie Mellon University
- Spencer Huang
- Ken Goldberg
- LeRobot 101
- CaP-X
RESEARCH · arXiv cs.CL · 6d · [7 sources]

Dynamic Chunking for Diffusion Language Models

Researchers are exploring new methods to improve the efficiency and scalability of diffusion language models (DLMs) for generating long sequences of text. One approach, Block Approximate Sparse Attention (BA-Att), accelerates attention computation by downsampling the attention space, achieving significant speedups while maintaining near full-attention performance. Another development, Dynamic Chunking Diffusion Models (DCDM), replaces fixed positional blocks with content-defined semantic chunks to better capture sequence structure. Additionally, advancements in continuous diffusion models, like RePlaid, demonstrate competitive performance against discrete DLMs, suggesting they are a viable and scalable alternative. AI

IMPACT New techniques promise faster and more scalable text generation from diffusion models, potentially enabling longer and more coherent outputs.
RESEARCH · Hugging Face Daily Papers · 6d · [6 sources]

PhyWorld: Physics-Faithful World Model for Video Generation

Researchers are developing new methods to improve autoregressive video generation, focusing on extending the length and quality of generated videos. Several papers introduce techniques to manage long-term temporal consistency and adaptively select relevant historical frames, moving beyond fixed memory allocations. These advancements aim to enhance video generation models for applications like physics simulation and interactive content creation, often without requiring additional training. AI

IMPACT Advances in long video generation could enable more realistic simulations and interactive content creation tools.
- Echo-Forcing
- VBench-Long
- NarrLV
- VBench
- MIGA
- PhyWorld
- Hugging Face
- FlowLong
- HunyuanVideo
- DySink
- arXiv
RESEARCH · arXiv cs.LG · 6d · [2 sources]

SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining

Researchers have developed SpectralEarth-FM, a new foundation model designed to process and fuse hyperspectral imagery with other Earth observation data like multispectral, radar, and temperature readings. This model utilizes a hierarchical transformer architecture that can handle varying spectral dimensions and integrates a cross-sensor fusion module. To train SpectralEarth-FM, a large dataset called SpectralEarth-MM was curated, containing over 40TB of co-located data from multiple satellite sensors, enabling state-of-the-art results on downstream tasks. AI

IMPACT Advances hyperspectral data processing and fusion, enabling more comprehensive Earth observation analysis.
SIGNIFICANT · dev.to — Claude Code tag Svenska(SV) · 1w · [13 sources]

What Are Claude Skills

Anthropic's Claude AI can now utilize "Skills," which are modular, reusable instruction packages stored in folders. Each skill consists of a SKILL.md file containing a description and plain Markdown instructions, allowing Claude to dynamically discover and execute specific tasks. This feature aims to enhance Claude's capabilities beyond one-off prompts, enabling more complex and efficient workflows for users. AI

IMPACT Enhances Claude's functionality by enabling modular, reusable task execution, potentially improving user productivity and workflow efficiency.
- Anthropic
- Claude
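
The folder layout described in the summary above can be pictured with a small discovery sketch. It assumes one folder per skill, each holding a SKILL.md, and a `description:` line inside that file; the line format and both function names are hypothetical, and this is not Claude's own discovery mechanism.

```python
from pathlib import Path


def read_description(skill_file: Path) -> str:
    """Return the value of the first 'description:' line, or '' if none."""
    for line in skill_file.read_text(encoding="utf-8").splitlines():
        if line.lower().startswith("description:"):
            return line.split(":", 1)[1].strip()
    return ""


def discover_skills(root: Path) -> dict[str, str]:
    """Map each skill folder name under root to its SKILL.md description."""
    skills = {}
    # Only folders that actually contain a SKILL.md count as skills.
    for skill_file in sorted(root.glob("*/SKILL.md")):
        skills[skill_file.parent.name] = read_description(skill_file)
    return skills
```

Calling `discover_skills(Path("skills"))` on a hypothetical `skills/` directory would return a name-to-description map, which is the kind of short index a model could scan before loading the full Markdown instructions for one skill.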
RESEARCH · arXiv cs.CV · 6d · [11 sources]

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Researchers have introduced several advancements in Diffusion Transformer (DiT) architectures for image generation and manipulation. One paper explores the use of register tokens in pixel-space DiTs to improve convergence and generation quality, finding they produce cleaner feature maps. Another proposes HyperDiT, which uses hyper-connected cross-scale interactions and registers to bridge semantic and pixel manifolds for high-fidelity generation. ElasticDiT focuses on efficiency for mobile devices by dynamically adjusting architecture and using sparse attention, while DreamSR enhances super-resolution by combining global and local textual features. Finally, DealMaTe and MaTe simplify material transfer by eliminating text guidance and relying on image inputs within DiT frameworks. AI

IMPACT These advancements in Diffusion Transformers offer improved image generation fidelity, efficiency for mobile devices, and new capabilities in super-resolution and material transfer.
- FLUX
- Diffusion Transformer
- MaTe
- DreamSR
- HyperDiT
- ElasticDiT
- Stable Diffusion-3
- DealMaTe
- ImageNet
- VAE
- ControlNet
RESEARCH · arXiv cs.AI · 1w · [4 sources]

TFGN: Task-Free, Replay-Free Continual Pre-Training Without Catastrophic Forgetting at LLM Scale

Researchers have developed new architectural approaches to address catastrophic forgetting in large language models during continual pre-training and fine-tuning. One method, TFGN, introduces an overlay that allows for parameter-efficient updates without altering the core transformer, demonstrating significant retention of prior knowledge across diverse domains and model scales. Another approach, UAM, inspired by biological vision, uses a dual-stream architecture to separate semantic understanding from action control, preserving multimodal capabilities during VLA model training. These advancements aim to enable models to learn continuously without degrading performance on previously acquired knowledge. AI

IMPACT New architectural designs for LLMs and VLA models promise improved continual learning capabilities, reducing knowledge degradation during fine-tuning and pre-training.
- OpenAI
- Python
- TFGN
- Chinese
- Prose
- LLaMA 3.1
- GPT-2
- LLM
RESEARCH · Hugging Face Daily Papers · 1w · [3 sources]

Beyond Parameter Aggregation: Semantic Consensus for Federated Fine-Tuning of LLMs

Researchers have developed novel methods for federated fine-tuning of large language models, moving beyond traditional parameter aggregation. One approach focuses on exchanging model outputs on a shared prompt set to achieve semantic consensus, drastically reducing communication costs and accommodating heterogeneous architectures. Another method, CLAIR, specifically addresses LoRA fine-tuning in federated settings, offering contamination-aware recovery of the shared LoRA subspace and improved performance over standard federated averaging. AI

IMPACT These new federated learning techniques could enable more efficient and secure collaborative fine-tuning of LLMs, especially in scenarios with private data or heterogeneous hardware.
RESEARCH · Hugging Face Daily Papers · 1w · [4 sources]

Variational Linear Attention: Stable Associative Memory for Long-Context Transformers

Researchers are developing new attention mechanisms to handle increasingly long contexts in large language models. One approach, Runtime-Certified Bounded-Error Quantized Attention, uses tiered KV caches to compress memory while guaranteeing fallback to exact attention, ensuring quality for tasks like language modeling and retrieval. Another method, DashAttention, employs differentiable sparse hierarchical attention to adaptively select relevant tokens, achieving high sparsity with comparable accuracy to full attention and offering improved performance over existing hierarchical methods. Variational Linear Attention (VLA) reframes linear attention as a regularized least-squares problem, limiting state norm growth and improving associative recall accuracy, while also achieving significant speedups. AI

IMPACT These advancements in attention mechanisms promise to significantly improve the efficiency and capability of LLMs in processing and understanding long contexts.
SIGNIFICANT · NVIDIA Blog · 1w · [8 sources]

Vera Arrives: NVIDIA’s First CPU Built for Agents Lands at Top AI Labs

NVIDIA has begun delivering its new Vera CPU, designed specifically for agentic AI workloads, to leading AI labs including OpenAI, Anthropic, and xAI. This move signifies NVIDIA's strategic expansion into custom CPU development to support the growing demands of AI agents beyond GPUs. Concurrently, NVIDIA CEO Jensen Huang revealed the company's substantial investment strategy, having invested $43 billion in startups and committed significant capital to AI companies like OpenAI and Anthropic, aiming to deepen its ecosystem reach and solidify its hardware dominance. AI

IMPACT NVIDIA's new Vera CPU launch and substantial startup investments signal a deepening integration of specialized hardware into the AI ecosystem, potentially accelerating agent development and reinforcing NVIDIA's market influence.
- Elon Musk
- Jensen Huang
- James Bradbury
- Anthropic
- OpenAI
- NVIDIA
- Oracle Cloud Infrastructure
- Ian Buck
- Sachin Katti
- Vera CPU
- SpaceXAI
- Dario Amodei
- xAI
TOOL · r/cursor · 3d · [9 sources]

Composer 2.5 has been released (2x usage for the next week)

Users of the Cursor IDE are reporting that the new Composer 2.5 model significantly outperforms previous versions and even larger models like GPT-4.5. Many are finding Composer 2.5 to be faster, more accurate, and notably cheaper, leading them to adopt it as their default for most coding tasks. This shift is reducing their reliance on more expensive, high-end models for everyday development work. AI

IMPACT This update offers a faster, more accurate, and cost-effective coding assistant within the Cursor IDE, potentially reducing developer reliance on more expensive models for daily tasks.
SIGNIFICANT · 36氪 (36Kr) 中文(ZH) · 1w · [43 sources]

Nvidia: This year's CPU revenue is expected to reach $20 billion

Google has launched its Gemini 3.5 series of models, including updates to its large context window capabilities. Separately, Nvidia's CFO expressed confidence in significant revenue from their Blackwell and Vera Rubin chips, projecting substantial income between 2025 and 2027. Airbnb is expanding its offerings to include grocery delivery, car rentals, and AI-powered tools for trip planning and property comparison. AI

IMPACT Major AI model updates and hardware revenue projections signal continued industry growth and innovation.
- Nvidia
- DeepSeek
- Alibaba
- Google
- Vera CPU
- Gemini 3.5
- Eddie Wu
- Blackwell
- Vera Rubin
- AI
- Airbnb
SIGNIFICANT · Wired — AI · 3w · [120 sources]

Everything Announced at Google I/O 2026: Gemini, Search, Smart Glasses

Google has announced significant updates to its Gemini AI model and Search capabilities at its I/O 2026 event. The company is integrating Gemini more deeply into its core services, including Search, Docs, and YouTube, enabling users to generate and export files directly from chat interfaces. New Gemini 3.5 models, Flash and Pro, are being rolled out, with Flash powering the enhanced Search experience. Google aims to transform Search from an information retrieval tool into an action-oriented platform, allowing AI agents to perform tasks and provide direct solutions. AI

IMPACT Google's integration of Gemini AI agents into Search and Workspace aims to shift user interaction from information retrieval to task execution.
FRONTIER RELEASE · Engadget · 3w · [56 sources]

Ask YouTube compiles video answers to your questions

Google has unveiled Gemini Omni, a new multimodal AI model capable of generating and editing video from diverse inputs like text, images, and audio. This advanced model, which understands physics and real-world knowledge, is being integrated into the Gemini app, YouTube Shorts, and the Flow creative studio. Additionally, Google is enhancing its YouTube platform with an AI-powered conversational search feature called 'Ask YouTube,' which compiles video answers to user queries and offers follow-up questions for refined results.

IMPACT Sets new benchmarks for multimodal AI, enabling complex video creation and editing directly from diverse inputs.
- Databricks
- Google
- AI agents
- Unity Catalog
- YouTube Shorts
- Remix
- Gemini Omni
- SynthID
- Google I/O
- Ask YouTube
- Claude
- Gemini app
- Veo 3.1
- Gemini Flash
- ChatGPT
SIGNIFICANT · Engadget · 1w · [9 sources]

Volvo reveals $58,400 starting price for the EX60

Google is integrating its Gemini AI into Volvo vehicles, starting with the upcoming EX60 SUV. Gemini will leverage the car's external cameras to interpret parking signs and provide more detailed navigation through an "Immersive Navigation" feature. This enhanced AI integration is part of a broader overhaul of Android Auto and Google-built-in car systems, aiming for greater contextual awareness and automation for drivers.

IMPACT Enhances in-car AI capabilities, potentially improving driver safety and convenience through advanced interpretation and navigation features.
- EX60
- Rivian R2
- BMW iX3
- Google
- Volvo
- Gemini
- Android Auto
- Android Automotive
- Qualcomm
- Material 3
FRONTIER RELEASE · Simon Willison · 3w · [69 sources]

Gemini 3.5 Flash: more expensive, but Google plan to use it for everything

Google has launched Gemini 3.5 Flash, a new model designed for agentic workflows and coding tasks, available immediately across its consumer and developer platforms. This release also introduces Gemini Omni for multimodal generation, particularly video, and the Antigravity agent stack. While Gemini 3.5 Flash offers significant speed and a 1 million token context window, its pricing has increased substantially compared to previous versions, aligning with a trend of rising costs among major AI labs.

IMPACT Sets a new standard for agentic AI performance and multimodal capabilities, potentially accelerating enterprise adoption and pushing competitors.
COMMENTARY · 36氪 (36Kr) Chinese (ZH) · 3d · [3 sources]

US large-cap tech stocks mixed in pre-market trading, Nvidia up over 2%

OpenAI has announced that a country will be the first to offer free access to ChatGPT Plus. Meanwhile, Nvidia's stock saw a pre-market increase of over 2%, alongside other major tech stocks showing mixed performance. The news flashes also touched upon a significant price drop in several gasoline-powered vehicles and a local story about an individual creating an AI-generated short film for a low cost.

IMPACT OpenAI's move could signal broader accessibility trends for advanced AI tools.
- Warren Buffett
- Microsoft
- Google
- Amazon
- Nvidia
- Intel
- Jensen Huang
- 36氪
- Tesla
- Micron Technology
- ChatGPT Plus
- OpenAI
RESEARCH · Hugging Face Daily Papers · 1w · [5 sources]

Improving Diffusion Posterior Samplers with Lagged Temporal Corrections for Image Restoration

Researchers have developed new methods to improve diffusion models for various inverse problems. One approach, AVIS, uses autoregressive diffusion models to accelerate video restoration, significantly reducing latency and increasing throughput. Another development, LAMP, enhances diffusion posterior samplers by incorporating lagged temporal corrections for image restoration tasks. Additionally, Stein Diffusion Guidance (SDG) offers a training-free framework for posterior correction, enabling more effective guidance in low-density regions for tasks like image generation and protein docking.

IMPACT These advancements in diffusion models promise faster and more accurate solutions for complex tasks like video restoration and image generation, potentially enabling real-time applications.
MEME · Mastodon — fosstodon.org (SL) · 12h

google.jpg #AI #google #RFC3514 #evil

Google has released a new AI model, though details are scarce. The model is reportedly named "google.jpg" and is associated with RFC 3514, the April Fools' RFC that defines an "evil bit" in the IPv4 header for flagging malicious packets. The context suggests this may be a playful or satirical release rather than a serious AI advancement.

IMPACT This appears to be a non-serious or satirical release, with no discernible impact on AI operations.
- Google
- google.jpg
RESEARCH · arXiv cs.AI · 2w · [4 sources]

TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization

Researchers are exploring advanced methods for aligning large language models with human preferences, moving beyond traditional Reinforcement Learning from Human Feedback (RLHF). New approaches like Direct Preference Optimization (DPO) offer simpler implementations but have theoretical limitations. Papers introduce refinements such as Constrained Preference Optimization (CPO) and Topology- and Uncertainty-Aware DPO (TUR-DPO) to address these shortcomings and improve alignment guarantees.

IMPACT New alignment techniques like CPO and TUR-DPO offer improved theoretical guarantees and empirical performance for LLMs.
RESEARCH · Smol AINews · 2w · [2 sources]

not much happened today

Recent AI news highlights advancements in coding agents and model releases. Companies are focusing on productionizing agents with observability and automation loops, moving beyond simple chat interfaces. New models like Cursor's Composer 2.5 and Alibaba's Qwen 3.7 show improved performance, particularly in coding and reasoning tasks. OpenAI also announced a significant breakthrough in discrete geometry, with a general-purpose reasoning model disproving a long-standing mathematical conjecture, indicating potential for broader scientific applications.

IMPACT New models and research are pushing the boundaries of AI capabilities in reasoning, coding, and scientific discovery.
- Anthropic
- OpenAI
- LangChain
- Claude Code
- Cognition
- Alibaba
- GitHub Copilot CLI
- François Chollet
- Cursor AI
- Composer 2.5
- Devin Auto-Triage
- Cursor
- LangSmith Engine
- Cohere
- Claude
- Command A+
- Qwen 3.7
SIGNIFICANT · dev.to — Claude Code tag · 3w · [24 sources]

Claude Code vs ChatGPT Codex: Two Official Agents, One Choice You Don't Have to Make

Anthropic's Claude is reportedly surpassing OpenAI's ChatGPT in market share and revenue, with some users switching due to perceived advantages in specific tasks. While ChatGPT's Codex agent excels at remote, asynchronous coding tasks, Claude Code offers a more interactive, real-time pair-programming experience. Businesses are re-evaluating their AI strategies and investments in light of this market shift, with some reporting improved customer satisfaction and productivity after adopting Claude.

IMPACT Companies are reassessing AI investments and strategies as Anthropic's Claude gains market share over OpenAI's ChatGPT.
- Scale AI
- Tiezhen Wang
- Google DeepMind
- Hugging Face
- Meta
- ChatGPT
- Claude
- GPT-5.4
- Dylan Patel
- SemiAnalysis
- Dario Amodei
- Gemini 3.1 Pro
- DeepSeek
- Patrick O'Shaughnessy
- OpenAI
- Zuckerberg
- Susan Zhang
- Anthropic
- DALL-E
- Sora
- Gemini
- SAP
- Claude Code
- DeepSeek V4
- Claude Max
- ChatGPT Codex
RESEARCH · arXiv cs.LG · 3w · [15 sources]

BROS: Bias-Corrected Randomized Subspaces for Memory-Efficient Single-Loop Bilevel Optimization

Researchers have developed new methods for improving machine learning models in various complex scenarios. One paper introduces a nonparametric learning framework for dynamic pricing with limited feedback and nonstationary market conditions, offering revenue guarantees. Another study presents BROS, a memory-efficient bilevel optimization method that significantly reduces peak memory usage while maintaining competitive convergence rates for hyperparameter learning. Additionally, a new approach models surgical team dynamics in real-time using time-expanded interaction graphs, providing actionable insights for improved performance.

IMPACT Advances in nonparametric learning, bilevel optimization, and team dynamics modeling offer new tools for AI applications.
- Machine Learning
- arXiv
- ViT
- Computer Science
- BROS
- PRISM-CTG
- AirFM-DDA
RESEARCH · arXiv cs.LG · 3w · [12 sources]

DGPO: Distribution Guided Policy Optimization for Fine Grained Credit Assignment

Researchers have introduced Distribution Guided Policy Optimization (DGPO), a new reinforcement learning framework designed to improve how large language models handle complex reasoning tasks. Current methods struggle with assigning credit for specific steps within long chains of thought, hindering the discovery of new reasoning paths. DGPO addresses this by using distribution deviation as a guiding signal instead of a strict penalty, aiming for more stable and effective model alignment.

IMPACT This new framework could lead to more capable LLMs that can perform complex reasoning tasks more effectively.
RESEARCH · Mastodon — sigmoid.social Japanese (JA) · 3w · [133 sources]

NVIDIA Brings Agents to Life with DGX Spark and Reachy Mini https://huggingface.co/blog/nvidia-reachy-mini ※AI-generated automatic post (headline + link) #AI #GenerativeAI #LLM #AIGenerated

Hugging Face has announced several updates and collaborations across its platform. These include enhancements to OCR pipelines with open models, the integration of Sentence Transformers, and the release of Transformers.js v4. Additionally, Hugging Face is strengthening AI security through a partnership with VirusTotal and introducing new models like Granite 4.0 Nano and AnyLanguageModel for efficient LLM operations.

IMPACT Hugging Face continues to expand its ecosystem with new models, tools, and collaborations, enhancing capabilities in OCR, AI security, and efficient LLM deployment.
- LLM
- Hugging Face
- NVIDIA
- LeRobot
- NVIDIA Isaac
- AprielGuard
- llama.cpp
- Google Cloud
- AnyLanguageModel
- Anthropic
- AMD
- IBM
- VirusTotal
- Transformers.js
- ServiceNow
- Sentence Transformers
- Granite 4.0 Nano