Brief · PulseAugur

RESEARCH · arXiv cs.CL English(EN) · 6d · [2 sources]

Reinforcing Human Behavior Simulation via Verbal Feedback

Two new research papers explore the limitations of current large language models in simulating realistic human behavior. The first paper, "OmniBehavior," introduces a benchmark using real-world data and finds that LLMs tend to exhibit a positive, homogenized bias, failing to capture individual differences. The second paper, "DITTO," proposes a reinforcement learning approach that incorporates verbal feedback to improve LLM simulation capabilities, showing significant gains over base models and outperforming GPT-5.4 on several benchmarks.

IMPACT: New benchmarks and RL techniques highlight LLM limitations in simulating diverse human behaviors, indicating a need for more nuanced training data and feedback mechanisms.

GPT-5.4
SOUL
DITTO
Large Language Models
OmniBehavior