New LEDGER benchmark tests LLMs on long-context financial report analysis

By the PulseAugur editorial team · [2 sources] · 2026-06-11 09:28

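Researchers have introduced LEDGER, a new benchmark dataset for evaluating the long-context capabilities of large language models in financial retrieval and extraction. The dataset comprises 4,999 digitized corporate annual reports that combine charts, tables, and narrative text, going beyond simplified regulatory filings. LEDGER includes three distinct evaluation benchmarks, ranging from page-level KPI retrieval to conversational lookup and full KPI extraction, all derived from number-dense, lengthy reports. The project also provides human-annotated data and a comprehensive toolchain for extraction, validation, and scoring, and demonstrates its utility through a case study on CEO letter rhetoric and market impact.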

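Impact: The benchmark will enable more rigorous evaluation of how well LLMs process long financial documents and extract information from them.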

Ranking rationale: This cluster describes a new academic paper introducing a benchmark dataset for evaluating LLMs.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Charles Moslonka, Amaury de Vitry, Arthur Garnier, Hicham Randrianarivo, Emmanuel Malherbe · 2026-06-12 04:00

LEDGER: A Long-Context Benchmark of Corporate Annual Reports for Financial Retrieval and Extraction

arXiv:2606.13100v1. Abstract: Finance reporting is a natural proving ground for large language models, and the very-long-context capabilities of recent models across all sizes make rigorous evaluation in this domain an increasingly pressing need. Yet most public…

arXiv cs.CL TIER_1 English(EN) · Emmanuel Malherbe · 2026-06-11 09:28

LEDGER: A Long-Context Benchmark of Corporate Annual Reports for Grounded Financial Retrieval and Extraction

Finance reporting is a natural proving ground for large language models, and the very-long-context capabilities of recent models across all sizes make rigorous evaluation in this domain an increasingly pressing need. Yet most public financial resources reduce the task to plain-te…
