PulseAugur

LLM reasoning models fail behavioral simulation in multi-agent negotiation

A new research paper examines the mismatch between reasoning capability and behavioral simulation in large language models used as agents in multi-agent negotiation. The study found that models such as DeepSeek and OpenAI's GPT-5.2, when used for their reasoning abilities, often defaulted to authority-driven outcomes rather than negotiated ones. The paper argues that accurate institutional simulations require evaluating models on their intended behavioral role, not only on their strategic capability.

IMPACT Highlights the need to evaluate LLMs for specific behavioral roles in simulations, not just raw strategic capability.

RANK_REASON The cluster contains an arXiv paper detailing research findings on LLM behavior.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. arXiv cs.LG · TIER_1 · English (EN) · Sandro Andric

    When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation

    arXiv:2604.11840v2 · Announce Type: replace

    Abstract: Behavioral simulation and strategic problem solving are different tasks. Large language models are increasingly explored as agents in policy-facing institutional simulations, but stronger reasoning need not improve behavioral sa…