FLOWREADER: Min-Cost Flow Optimization for Multi-Modal Long Document Q&A
Researchers have developed FLOWREADER, a method for question answering over long, multimodal documents. The approach reframes evidence assembly as a min-cost flow problem, enabling better handling of information that is fragmented across text, tables, and slides. FLOWREADER outperforms traditional top-k retrieval methods on specific subsets of the VisDoMBench benchmark, demonstrating its effectiveness in complex evidence assembly scenarios.
AI IMPACT: Introduces a novel approach to multimodal Q&A, potentially improving performance on complex documents.
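
To make the min-cost flow framing concrete, here is a minimal sketch of how evidence selection could be posed as a flow problem. It is not FLOWREADER's actual formulation, which the summary does not describe. The graph layout (source, chunk nodes, modality nodes, sink), the per-modality quota, the integer costs, and every name in the code (`MinCostFlow`, `select_evidence`, the chunk ids) are assumptions made for illustration only.

```python
from collections import deque


class MinCostFlow:
    """Successive shortest paths with a queue-based Bellman-Ford search."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # node -> list of edge indices
        self.to, self.cap, self.cost = [], [], []

    def add_edge(self, u, v, cap, cost):
        # Forward edge gets an even index; its residual twin is index ^ 1.
        for a, b, c, w in ((u, v, cap, cost), (v, u, 0, -cost)):
            self.graph[a].append(len(self.to))
            self.to.append(b)
            self.cap.append(c)
            self.cost.append(w)
        return len(self.to) - 2  # index of the forward edge

    def flow(self, s, t, max_flow):
        sent, total_cost = 0, 0
        inf = float("inf")
        while sent < max_flow:
            dist = [inf] * self.n
            prev_edge = [-1] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for e in self.graph[u]:
                    v = self.to[e]
                    if self.cap[e] > 0 and dist[u] + self.cost[e] < dist[v]:
                        dist[v] = dist[u] + self.cost[e]
                        prev_edge[v] = e
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if dist[t] == inf:
                break  # no augmenting path left
            # Find the bottleneck capacity along the cheapest path.
            push, v = max_flow - sent, t
            while v != s:
                e = prev_edge[v]
                push = min(push, self.cap[e])
                v = self.to[e ^ 1]
            # Apply the augmentation to the residual graph.
            v = t
            while v != s:
                e = prev_edge[v]
                self.cap[e] -= push
                self.cap[e ^ 1] += push
                v = self.to[e ^ 1]
            sent += push
            total_cost += push * dist[t]
        return sent, total_cost


def select_evidence(chunks, k, quota):
    """Pick up to k chunks at minimum total cost, at most `quota` per modality.

    chunks: list of (chunk_id, modality, cost); lower cost = more relevant.
    This graph design is a hypothetical illustration, not the paper's.
    """
    modalities = sorted({m for _, m, _ in chunks})
    source, sink = 0, 1
    chunk_node = {cid: 2 + i for i, (cid, _, _) in enumerate(chunks)}
    mod_node = {m: 2 + len(chunks) + i for i, m in enumerate(modalities)}
    net = MinCostFlow(2 + len(chunks) + len(modalities))
    pick_edge = {}
    for cid, modality, cost in chunks:
        pick_edge[cid] = net.add_edge(source, chunk_node[cid], 1, cost)
        net.add_edge(chunk_node[cid], mod_node[modality], 1, 0)
    for m in modalities:
        net.add_edge(mod_node[m], sink, quota, 0)
    _, total = net.flow(source, sink, k)
    # A saturated source->chunk edge means that chunk carries flow.
    chosen = [cid for cid in pick_edge if net.cap[pick_edge[cid]] == 0]
    return chosen, total


# Made-up toy data: three text chunks score best, but the quota caps text at 2.
chunks = [
    ("text_1", "text", 1),
    ("text_2", "text", 2),
    ("text_3", "text", 3),
    ("table_1", "table", 5),
    ("slide_1", "slide", 6),
]
chosen, total = select_evidence(chunks, k=3, quota=2)
print(chosen, total)
```

In this toy setup a plain top-3 ranking by cost would return three text chunks, while the flow solution swaps the third text chunk for the table chunk because the text modality's edge to the sink is saturated. That illustrates the general idea of encoding assembly constraints as capacities rather than ranking chunks independently.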