Reinforcing Human Behavior Simulation via Verbal Feedback
Two new research papers explore the limitations of current large language models in simulating realistic human behavior. The first paper, "OmniBehavior," introduces a benchmark using real-world data and finds that LLMs tend to exhibit a positive, homogenized bias, failing to capture individual differences. The second paper, "DITTO," proposes a reinforcement learning approach that incorporates verbal feedback to improve LLM simulation capabilities, showing significant gains over base models and outperforming GPT-5.4 on several benchmarks.
AI IMPACT: New benchmarks and RL techniques highlight LLM limitations in simulating diverse human behaviors, indicating a need for more nuanced training data and feedback mechanisms.