PulseAugur

LLM reasoning models fail behavioral simulation in multi-agent negotiation

A new research paper examines the mismatch between reasoning capability and behavioral simulation in large language models used for multi-agent negotiation. The study found that models such as DeepSeek and OpenAI's GPT-5.2, when used for their reasoning abilities, often defaulted to authority-driven outcomes rather than negotiated ones. The paper suggests that accurate institutional simulation depends on evaluating models against their intended behavioral role, not just their strategic capability.

Impact: Highlights the need to evaluate LLMs for specific behavioral roles in simulations, not just raw strategic capability.

Ranking rationale: The cluster contains an arXiv paper detailing research findings on LLM behavior.

Read on arXiv cs.LG

AI-generated summary · Google Gemini · From 1 source.


Sources [1]

  1. arXiv cs.LG · English (EN) · Sandro Andric

    When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation

    arXiv:2604.11840v2. Abstract: Behavioral simulation and strategic problem solving are different tasks. Large language models are increasingly explored as agents in policy-facing institutional simulations, but stronger reasoning need not improve behavioral sa…