Brief

last 24h

[2/2] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.AI English(EN) · 4d

TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance

TimeRewarder is a method for learning dense reward signals from passive videos. It models temporal distances between frame pairs to estimate task progress, and that estimate then guides reinforcement learning agents. In experiments on ten Meta-World tasks, TimeRewarder significantly improved success rates and sample efficiency, outperforming both manually designed rewards and previous methods. The approach also showed potential for using real-world human videos to generate reward signals at scale.
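To make the idea of a reward derived from frame-wise temporal distance concrete, here is a minimal sketch. It is not the paper's implementation: the per-frame embeddings, the linear least-squares model, and the names `make_pairs`, `fit_distance_model`, and `dense_reward` are all assumptions chosen for illustration. It fits a model that predicts the normalized temporal distance between two frames of a demonstration video, then uses the predicted distance between consecutive observations as a dense reward.

```python
import numpy as np

def make_pairs(frames):
    """Build training pairs from one video of shape (T, D).

    Each row of `frames` is a hypothetical per-frame embedding. The target
    for a pair (i, j) is the signed, normalized distance (j - i) / (T - 1).
    """
    T = len(frames)
    feats, targets = [], []
    for i in range(T):
        for j in range(T):
            feats.append(np.concatenate([frames[i], frames[j]]))
            targets.append((j - i) / (T - 1))
    return np.array(feats), np.array(targets)

def fit_distance_model(videos):
    """Fit a linear temporal-distance predictor by least squares (an assumption;
    a learned neural predictor would take this role in practice)."""
    X, y = zip(*(make_pairs(v) for v in videos))
    X = np.concatenate(X)
    y = np.concatenate(y)
    X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def dense_reward(w, frame_t, frame_next):
    """Reward for one transition: predicted progress from frame_t to frame_next."""
    x = np.concatenate([frame_t, frame_next, [1.0]])
    return float(x @ w)

# Toy video: an 11-frame demo whose 1-D embedding rises steadily from 0 to 1.
video = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
w = fit_distance_model([video])
forward = dense_reward(w, [0.2], [0.3])   # step toward task completion
backward = dense_reward(w, [0.3], [0.2])  # step away from it
print(forward, backward)
```

In this toy setup a step that moves forward along the demonstration receives a positive reward and a step backward receives a negative one, which is the kind of per-step progress signal a reinforcement learning agent can optimize without a hand-written reward.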

IMPACT Enables more efficient training of reinforcement learning agents by automating reward design from video data.

RESEARCH · arXiv cs.AI English(EN) · 6d · [2 sources]

DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

DISC (Decoupling Instruction from State-Conditioned Control) is a method for improving language-conditioned manipulation policies in robotics. It structurally separates instruction processing from state-conditioned control, which prevents policies from learning shortcuts that bypass language grounding. A hypernetwork generates task-specific visuomotor policies directly from instructions, so task awareness is derived solely from language.
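The decoupling described here can be sketched in a few lines. This is an illustration under stated assumptions, not DISC's architecture: the hypernetwork is a single random linear map `H`, the generated policy is linear with a tanh output, the state is a small vector rather than camera input, and the instruction embeddings are random stand-ins for the output of a language encoder. The point it shows is structural: the instruction only ever enters through the generated weights, and the policy itself takes nothing but the state.

```python
import numpy as np

rng = np.random.default_rng(0)
INSTR_DIM, STATE_DIM, ACTION_DIM = 8, 4, 2  # hypothetical sizes

# Hypernetwork parameters: maps an instruction embedding to the flattened
# weight matrix and bias of a small policy.
N_POLICY_PARAMS = STATE_DIM * ACTION_DIM + ACTION_DIM
H = rng.normal(scale=0.5, size=(N_POLICY_PARAMS, INSTR_DIM))

def generate_policy(instr_emb):
    """Return a policy whose weights are produced from the instruction alone."""
    params = H @ instr_emb
    W = params[: STATE_DIM * ACTION_DIM].reshape(ACTION_DIM, STATE_DIM)
    b = params[STATE_DIM * ACTION_DIM :]

    def policy(state):
        # The policy sees only the state; the instruction is baked into W and b.
        return np.tanh(W @ state + b)

    return policy

# Two stand-in instruction embeddings (a real system would use a language encoder).
instr_a = rng.normal(size=INSTR_DIM)
instr_b = rng.normal(size=INSTR_DIM)
policy_a = generate_policy(instr_a)
policy_b = generate_policy(instr_b)

state = rng.normal(size=STATE_DIM)
action_a = policy_a(state)
action_b = policy_b(state)
print(action_a, action_b)
```

Because the same state is passed to both generated policies, any difference between the two actions can only come from the instruction, which is the property the structural separation is meant to guarantee.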

IMPACT This method could lead to more robust and adaptable robotic systems that better understand and execute complex instructions.
