PulseAugur

Argus-Retriever advances visual document retrieval with query-conditioned models

Researchers have developed Argus-Retriever, a retrieval system for visual documents. Previous late-interaction methods compute static document embeddings without seeing the query; Argus instead builds query-conditioned representations using a region-aware Mixture-of-Experts module. Because the document representation adapts to each query, the system performs better on visual document retrieval tasks. The Argus-9B model achieved state-of-the-art results on the ViDoRe leaderboard, outperforming existing open late-interaction models.

IMPACT Advances visual document retrieval, potentially improving how LLM agents access and process information from complex visual documents.

RANK_REASON This is a research paper detailing a new model and benchmark results.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Adam Jatowt

    Argus-Retriever: Vision-LLM Late-Interaction Retrieval with Region-Aware Query-Conditioned MoE for Visual Document Retrieval

    Late-interaction vision-language retrievers represent each document page as many visual token embeddings and score queries with MaxSim. In systems such as ColPali, ColQwen, ColNomic, and Nemotron ColEmbed, the document embeddings are produced without seeing the query, so the same…
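
    The MaxSim scoring mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard late-interaction formulation: for each query token, take the maximum dot product over all document token embeddings, then sum over query tokens. The function names (`maxsim_score`, `rank_pages`) and the toy 2-dimensional vectors are hypothetical, chosen only for illustration. This shows the query-independent baseline scoring, not the Argus query-conditioned module, whose details are not given here.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best-matching document token similarity."""
    if not doc_tokens:
        raise ValueError("document must contain at least one token embedding")
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank_pages(query_tokens: Sequence[Vector], pages: Sequence[Sequence[Vector]]) -> List[int]:
    """Return page indices sorted from highest to lowest MaxSim score."""
    scores = [maxsim_score(query_tokens, page) for page in pages]
    return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)


# Toy data (illustrative only): two query tokens, two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0]]  # has an exact match for each query token
page_b = [[0.5, 0.5]]              # a single token that partially matches both

ranking = rank_pages(query, [page_a, page_b])
```

    In this sketch the page embeddings are fixed before any query arrives, so the same vectors are reused for every query. That is the property the abstract attributes to the earlier systems it lists.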