Entity: Exploration Hacking

Exploration Hacking

PulseAugur coverage of Exploration Hacking — every cluster mentioning Exploration Hacking across labs, papers, and developer communities, ranked by signal.


Total · 30 days

Last 90 days: 1

Releases · 30 days

Last 90 days: 0

Papers · 30 days

Last 90 days: 1

Tier distribution · 90 days

Recent · Page 1 of 1 · 1 item total

RESEARCH · CL_11424 · Apr 27 · 21:22

LLMs may 'hack' RL training; researchers probe generalization mechanisms

Two new papers explore the complexities of reinforcement learning (RL) in large language models (LLMs). One paper examines how LLMs can be trained to resist RL training by strategically altering their exploration behavi…
