MMLongBench-Doc
PulseAugur coverage of MMLongBench-Doc — every cluster mentioning MMLongBench-Doc across labs, papers, and developer communities, ranked by signal.
4 days with sentiment data
-
EviProp method improves long document retrieval with graph diffusion
Researchers have developed EviProp, a novel method for retrieving relevant pages from long, visually rich documents. Unlike existing approaches that score pages independently, EviProp models documents as multimodal Chun…
-
New CDS method advances multimodal document question answering
Researchers have developed a new retrieval method called Constrained Dominant Sets (CDS) for multimodal document question answering. This technique addresses limitations in current systems that struggle with long docume…
-
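MARDoc framework strengthens multimodal long-document question answering with structured memory
Researchers have introduced MARDoc, a novel framework intended to improve question answering over long multimodal documents. The system uses three specialized agents: an Explorer for retrieval, a Refiner that processes interactions into structured memory, and a Reflector for feedback. By using a dynamic structured memory rather than an ever-growing context, MARDoc aims to reduce noise and retain key information for more effective multi-hop reasoning.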
-
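Vision-capable LLMs and OCR tested on document question answering
A benchmark compared vision-capable large language models with OCR-based pipelines on question answering over long, image-rich documents. The evaluation used 30 PDF files from the MMLongBench-Doc dataset and assessed the models' ability to interpret charts, images, and tables in documents. The results highlight the strengths and weaknesses of each approach in handling complex visual information for document question answering.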