OpenAI Gym

PulseAugur coverage of OpenAI Gym: every cluster mentioning OpenAI Gym across labs, papers, and developer communities, ranked by signal.

Total · 30d: 3 (3 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 3 (3 over 90d)
RECENT · PAGE 1/1 · 3 TOTAL
  1. RESEARCH · CL_48816

    LLMs explore preference alignment and failure mitigation techniques

    Researchers are exploring new methods for aligning large language models (LLMs) with human preferences and mitigating specific failure modes. One approach uses Direct Preference Optimization (DPO) to reduce text degener…

  2. TOOL · CL_22081

    Researchers fix synthetic data failures in reinforcement learning policy optimization

    Researchers have identified and addressed algorithmic failures in Model-Based Policy Optimization (MBPO), a technique used in reinforcement learning. The study found that MBPO can underperform compared to other methods …

  3. TOOL · CL_16012

    New interpretable experiential learning model shows promise for reinforcement learning

    Researchers have introduced a novel interpretable experiential learning model that utilizes state history and global feedback to construct a behavioral model. This model represents learning as a transition graph between…