CLQT: A Closed-Loop, Cost-Aware, Strategy-Consistent Benchmark for Diagnostic Evaluation of LLM Portfolio-Management Agents

New CLQT benchmark evaluates LLM agents' trading strategies, not just their returns

By the PulseAugur editorial team · [1 source] · 2026-06-30 04:00

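Researchers have introduced CLQT, a new benchmark for evaluating large language model (LLM) agents in portfolio management. Unlike earlier benchmarks, which rank agents mainly by returns, CLQT diagnoses agent performance in a closed-loop, cost-aware, strategy-consistent trading environment. The approach aims to assess an agent's reasoning, strategy consistency, and underlying capability rather than only its short-term financial results.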

Impact: The benchmark could lead to more robust evaluation of AI agents in complex, real-world financial applications.

Ranking rationale: This cluster describes an academic paper that introduces a new benchmark for evaluating AI agents.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · Bo Qu, Mingguang Chen · 2026-06-30 04:00

CLQT: A Closed-Loop, Cost-Aware, Strategy-Consistent Benchmark for Diagnostic Evaluation of LLM Portfolio-Management Agents

arXiv:2606.29771v1. Abstract: LLM agents are increasingly cast as autonomous portfolio managers, and benchmarks have moved from financial question-answering to sequential trading. Yet most still rank agents by returns over a fixed window -- a weak proxy, since a…