MuJoCo
PulseAugur coverage of MuJoCo — every cluster mentioning MuJoCo across labs, papers, and developer communities, ranked by signal.
2 days of sentiment data
-
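New reinforcement learning policies improve efficiency through one-step generative control
Researchers have developed new reinforcement learning policy methods aimed at improving efficiency and expressiveness. One method, score-based one-step mean flow policy optimization (SOM), uses the Q-function score and a probability flow ODE to construct a target velocity field, achieving state-of-the-art performance in online reinforcement learning while reducing training and inference time. Another development, the Stochastic Mean Flow Policy (SMFP), offers a one-step generative policy class that maps noise to actions via a mean flow transformation, providing a unified objective for stable and exploratory policy improvement in off-policy settings.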
-
DIY Star Wars BDX droid learns to walk via reinforcement learning
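A hobbyist has recreated the BDX droid from Star Wars with a custom build that uses reinforcement learning for locomotion. Compared with the industrial components Disney uses, the project cuts costs significantly by using QDD motors and power from a repurposed lawnmower battery. The robot learned to walk in simulation environments in NVIDIA Isaac Lab and MuJoCo, where it was rewarded for keeping its balance and penalized for falling, and it achieved functional locomotion within hours.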
-
asRoBallet uses friction-aware RL for zero-shot Sim2Real transfer on ballbots
Researchers have developed asRoBallet, a novel end-to-end reinforcement learning policy for a humanoid ballbot, addressing the significant sim-to-real transfer gap in robotics. The system utilizes a high-fidelity MuJoCo…
-
GLiBRL advances Deep Bayesian RL with tractable inference and better generalization
Researchers have developed GLiBRL, a novel approach for Bayesian Reinforcement Learning that enhances generalization by explicitly incorporating Bayesian task parameters. This method overcomes limitations of prior deep …
-
Researchers fix synthetic data failures in reinforcement learning policy optimization
Researchers have identified and addressed algorithmic failures in Model-Based Policy Optimization (MBPO), a technique used in reinforcement learning. The study found that MBPO can underperform compared to other methods …
-
omicro Flux robot uses Claude Code and Cranq at AI expo
An exhibition showcasing generative AI was held on May 6, 2026, at the Sunrayce Hall. The event featured a spherical robot named "omicro Flux" integrated with the MuJoCo simulator and Claude Code. Other demonstrations i…
-
New research explores ensemble models for improved AI performance and robustness
Two new research papers introduce novel methods for improving ensemble models in machine learning. The first, PACE, combines pruning and compression techniques to create more efficient and interpretable ensembles, outpe…
-
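New research explores causal models beyond global monotonicity and under partial observation
Researchers have developed new frameworks for understanding causality in complex systems, particularly under non-monotonicity and partial observability. One paper introduces non-monotonic triangular structural causal models (NM-TM-SCMs) to handle cases where the global monotonicity assumption is violated, and demonstrates improved counterfactual recovery in simulations. Another work proposes partially observed structural causal models (POSCMs) for formalizing causal systems with latent context, offering a more general approach than standard SCMs. In addition, a score-based greedy search method, Latent variable Greedy Equivalence Search (LGES), is proposed for identifying partially observed…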
-
SAVGO algorithm uses geometry to improve reinforcement learning policy updates
Researchers have introduced SAVGO, a novel reinforcement learning algorithm designed to improve policy updates in continuous control tasks. SAVGO learns a joint state-action embedding space where similar action-value es…
-
Tsinghua University releases GS-Playground for efficient embodied AI simulation
Researchers from Tsinghua University's AIR DISCOVER Lab have developed and open-sourced GS-Playground, a novel simulation framework designed to overcome bottlenecks in visual-centric embodied AI training. The framework …
-
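New C++ engine HASE reaches 33M steps/second in multi-agent reinforcement learning training
Researchers have developed a new C++ engine called the Hide-and-Seek Engine (HASE), designed to significantly improve the efficiency of training reinforcement learning agents in decentralized, partially observable environments. By leveraging data-oriented design and optimized memory handling, HASE achieves a throughput of up to 33 million steps per second for a single agent. The engine greatly shortens training time for multi-agent policies, allowing complex cooperative behaviors to be learned within minutes.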
-
New activation functions boost AI plasticity in continual learning
Researchers have developed new activation functions, Smooth-Leaky and Randomized Smooth-Leaky, to address the loss of plasticity in continual learning models. These functions are designed to maintain a model's ability t…
-
Google AI unveils Nested Learning; OpenAI advances meta-learning and AI safety
Google Research has introduced "Nested Learning," a novel machine learning paradigm designed to address the challenge of catastrophic forgetting in continual learning. This approach views models as interconnected optimi…
-
OpenAI finds evolution strategies rival reinforcement learning for AI training
OpenAI researchers have found that evolution strategies (ES), a decades-old optimization technique, can rival the performance of modern reinforcement learning (RL) methods on benchmarks like Atari and MuJoCo. ES offers …