LLM reasoning models fail behavioral simulation in multi-agent negotiation

By PulseAugur Editorial · [1 sources] · 2026-05-07 04:00

A new research paper explores the mismatch between reasoning capabilities and behavioral simulation in large language models used for multi-agent negotiation. The study found that models like DeepSeek and OpenAI's GPT-5.2, when used for their reasoning abilities, often defaulted to authority-driven outcomes rather than negotiated ones. The paper suggests that evaluating models based on their intended behavioral role, rather than just strategic capability, is crucial for accurate institutional simulations. AI

IMPACT Highlights the need to evaluate LLMs for specific behavioral roles in simulations, not just raw strategic capability.

RANK_REASON The cluster contains an arXiv paper detailing research findings on LLM behavior. [lever_c_demoted from research: ic=1 ai=1.0]

Read on arXiv cs.LG →

paper
other

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

arXiv cs.LG TIER_1 English(EN) · Sandro Andric · 2026-05-07 04:00

When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation

arXiv:2604.11840v2 Announce Type: replace Abstract: Behavioral simulation and strategic problem solving are different tasks. Large language models are increasingly explored as agents in policy-facing institutional simulations, but stronger reasoning need not improve behavioral sa…

COVERAGE [1]

When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation

RELATED ENTITIES

RELATED TOPICS