LLMs show surface-level risk alignment, study finds

By PulseAugur Editorial Team · [1 source] · 2026-06-04 04:00

A new research paper examines whether large language models (LLMs) align with human decision-making mechanisms under risk, using the St. Petersburg game as a testbed. Many LLMs produce human-like finite bids in the original game, but this outcome-level resemblance often hides different underlying reasoning. In controlled variants of the game, LLMs frequently shift to conditionally rational behavior rather than maintaining human-consistent mechanisms, even after instruction tuning.
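For readers unfamiliar with the testbed, the sketch below shows why a finite bid is notable. It assumes the textbook form of the St. Petersburg game (a fair coin is tossed until the first heads, and a first heads on toss k pays 2^k); the paper's exact setup and its controlled variants may differ. The function names are illustrative, and the log-utility bidder ignores initial wealth for simplicity.

```python
import math


def truncated_expected_value(max_rounds: int) -> float:
    """Expected payoff when the game is capped at max_rounds tosses.

    Assumed textbook payoff: 2**k with probability 2**-k, so each
    possible round contributes exactly 1 to the expectation.
    """
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_rounds + 1))


def log_utility_certainty_equivalent(max_rounds: int) -> float:
    """Sure amount a log-utility bidder values equally to the capped game.

    Simplification: initial wealth is ignored, so utility is log(payoff).
    """
    expected_log = sum((0.5 ** k) * k * math.log(2) for k in range(1, max_rounds + 1))
    return math.exp(expected_log)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The expectation grows without bound as the cap is lifted,
        # while the log-utility valuation settles near a small finite number.
        print(n, truncated_expected_value(n), log_utility_certainty_equivalent(n))
```

Under these assumptions the expected payoff equals the cap on the number of tosses, so it diverges in the uncapped game, whereas a risk-averse valuation stays finite. A finite bid is therefore compatible with more than one underlying mechanism, which is the distinction the paper probes.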

Impact: Highlights the need to evaluate LLM decision-making beyond surface-level outcomes to confirm mechanism-level alignment.

Ranking rationale: Academic paper analyzing LLM behavior on a specific task.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Chensong Huang, Changyu Chen, Chenwei Lin, Hanjia Lyu, Xian Xu, Jiebo Luo · 2026-06-04 04:00

Probing Outcome-Level Resemblance and Mechanism-Level Alignment in LLM Risk Decisions: Evidence from the St. Petersburg Game

arXiv:2606.04978v1 Announce Type: new Abstract: LLMs can appear cautious in risk decision-making tasks, yet cautious-looking outputs do not necessarily indicate alignment with human decision-making mechanisms. We investigate this distinction using the St. Petersburg game as a con…

