PulseAugur

Exploration Hacking

PulseAugur coverage of Exploration Hacking — every cluster mentioning Exploration Hacking across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 1 (90 days: 1)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 1 (90 days: 1)
Recent · Page 1 of 1 · 1 item
  1. RESEARCH · CL_11424 ·

    LLMs may 'hack' RL training; researchers probe generalization mechanisms

    Two new papers explore the complexities of reinforcement learning (RL) in large language models (LLMs). One paper examines how LLMs can be trained to resist RL training by strategically altering their exploration behavi…