Brief

last 24h

[5/5] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
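
The brief does not say how the four signals are combined into a single 0–100 score. As a minimal sketch, assuming a weighted sum of the three quality signals multiplied by an exponential time decay, the idea could look like the following. The weights, the half-life, and every name here are illustrative assumptions, not the aggregator's actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights for the three quality signals (they sum to 1.0).
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.3, "headline_signal": 0.3}
# Assumed half-life: a story loses half its score every 48 hours.
HALF_LIFE_HOURS = 48.0


@dataclass
class Story:
    authority: float         # source authority, normalized to 0..1
    cluster_strength: float  # how many sources cover the story, normalized to 0..1
    headline_signal: float   # headline strength, normalized to 0..1
    age_hours: float         # hours since publication


def score(story: Story) -> float:
    """Combine the signals into a 0-100 score with exponential time decay."""
    base = (
        WEIGHTS["authority"] * story.authority
        + WEIGHTS["cluster_strength"] * story.cluster_strength
        + WEIGHTS["headline_signal"] * story.headline_signal
    )
    decay = 0.5 ** (story.age_hours / HALF_LIFE_HOURS)
    # Clamp to the documented range and round for display.
    return round(max(0.0, min(100.0, 100.0 * base * decay)), 1)


fresh = score(Story(1.0, 1.0, 1.0, age_hours=0.0))
two_days_old = score(Story(1.0, 1.0, 1.0, age_hours=48.0))
```

Under these assumptions a story with perfect signals scores 100 when fresh and 50 after one half-life, which is one way a 10-hour-old item can outrank a month-old one with similar coverage.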

TOOL · 雷峰网 (Leiphone) 中文(ZH) · 10h

Exclusive | Peking University's Dong Hao: "Scaling Laws That Only Stay at the Data Level Cannot Teach General Robots"

Dong Hao, an associate professor at Peking University and chief scientist at Shangwei Qiyuan, proposes a new paradigm for embodied AI development. He argues that current methods relying solely on imitation learning or reinforcement learning have limitations, particularly in handling errors and achieving general intelligence. Dong advocates a two-dimensional "Scaling Law" that considers both the quantity of data and the number of tasks, aiming for robots that become more efficient and capable the more they learn.

IMPACT This new 2D Scaling Law could accelerate the development of general-purpose robots by making learning more efficient.
RESEARCH · arXiv cs.AI English(EN) · 6d · [3 sources]

YUBI: Yielding Universal Bidigital Interface for Bimanual Dexterous Manipulation at Scale

Researchers have introduced DuoBench, a new framework for evaluating bimanual robot manipulation, implemented in simulation and partially in the real world. The benchmark includes eleven tasks and a novel evaluation scheme for detailed failure analysis, revealing that current policies struggle with complex dual-arm coordination. Separately, the YUBI interface has been developed, featuring a yielding, finger-driven gripper designed for more intuitive and ergonomic data collection for bimanual tasks. YUBI offers advantages over existing systems like UMI in dexterity and efficiency, enabling a large-scale dataset that allows policies to transfer across different robotic platforms.

IMPACT These advancements in bimanual manipulation benchmarks and data collection interfaces are crucial for developing more capable robotic foundation models.
- ELEY
- Franka
- FR3 Duo
- DuoBench
RESEARCH · Hugging Face Daily Papers English(EN) · 1w · [3 sources]

VISTA: Vision-Grounded and Physics-Validated Adaptation of UMI data for VLA Training

Researchers have developed VISTA, a framework designed to improve the training of Vision-Language-Action (VLA) models using real-world robot data. The framework addresses challenges such as distorted camera views and physically infeasible human-collected trajectories. VISTA incorporates a new dataset (UMI-VQA) for distorted visual inputs and a validation pipeline to filter out unsafe or impossible robot actions, leading to better policy performance.

IMPACT Enhances robot learning by enabling more robust training from real-world data, potentially improving deployment success.
TOOL · 量子位 (QbitAI) 中文(ZH) · 2w

τ0-WM: The Largest Open-Source Embodied World Model for Pre-training is Here

Researchers have introduced τ0-World Model (τ0-WM), an open-source embodied world model trained on 30,000 hours of data, of which 17,800 hours come from real robot teleoperation. The model goes beyond predicting future states by incorporating Test-Time Computation, allowing robots to evaluate and select optimal actions before execution, even correcting for potential errors. τ0-WM demonstrates improved performance on complex manipulation tasks compared to previous models, challenging the conventional approach of reserving real-world data solely for fine-tuning.

IMPACT Sets a new precedent for large-scale pre-training with real-world robot data, potentially accelerating embodied AI development.
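
The "evaluate and select optimal actions before execution" idea in the τ0-WM summary can be illustrated with a generic best-of-N selection loop: imagine the outcome of each candidate action with a predictive model, score the imagined outcome, and execute the best one. This is a toy sketch, not τ0-WM's actual procedure; `select_action`, `toy_predict`, and `toy_score` are hypothetical names, and the one-dimensional state stands in for a real world model.

```python
from typing import Callable, Sequence


def select_action(
    state: float,
    candidates: Sequence[float],
    predict_next: Callable[[float, float], float],
    score_state: Callable[[float], float],
) -> float:
    """Return the candidate whose predicted next state scores highest."""
    return max(candidates, key=lambda action: score_state(predict_next(state, action)))


def toy_predict(state: float, action: float) -> float:
    # Hypothetical stand-in for a learned world model: the action shifts the state.
    return state + action


def toy_score(next_state: float, goal: float = 1.0) -> float:
    # Higher is better: negative distance between the predicted state and the goal.
    return -abs(next_state - goal)


# Evaluate four candidate actions from state 0.2 and keep the best one.
best = select_action(0.2, [-0.5, 0.0, 0.5, 1.0], toy_predict, toy_score)
```

Here the candidate `1.0` wins because its predicted next state (about 1.2) lies closest to the goal of 1.0. A real system would replace the toy functions with a learned model and a task-specific value estimate.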
COMMENTARY · 量子位 (QbitAI) 中文(ZH) · 1mo

VLA is dead, teleoperation is dead too! So says Nvidia's robotics lead.

NVIDIA's Jim Fan declared the end of Vision-Language-Action (VLA) models and teleoperation in robotics, advocating World Action Models (WAM) as the new paradigm. Fan proposed that WAMs, inspired by Large Language Models (LLMs), will rely on next-state prediction and action fine-tuning for robot control. He emphasized a shift towards first-person human video as the primary training source, moving away from the limitations of teleoperation data collection.

IMPACT This commentary signals a potential shift in robotics research and development, moving towards new model architectures and data strategies.
- Robotics: Endgame
- LLM
- WAM
- DreamDojo
- EgoScale
- NVIDIA
- Jim Fan
- Sunday
- Taylor Swift
- Jensen Huang
- Elon Musk
- Andrej Karpathy
- Dream Zero
