PulseAugur

FreeRet framework turns multimodal LLMs into training-free retrievers

Researchers have developed FreeRet, a novel framework that lets multimodal large language models (MLLMs) function as effective retrievers without any additional training. The plug-and-play system extracts semantically grounded embeddings from off-the-shelf MLLMs for an initial candidate search, then uses the models' reasoning capabilities for precise reranking. On the MMEB and MMEB-V2 benchmarks, FreeRet outperforms models trained on millions of pairs, showing its potential to unify retrieval, reranking, and generation within a single model.
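The two-stage retrieve-then-rerank pipeline described above can be sketched in miniature. This is an illustrative toy, not FreeRet's actual API: `mllm_embed` and `mllm_rerank_score` are hypothetical stand-ins for the two frozen-MLLM passes the paper reuses (an embedding pass and a generative relevance-judging pass).

```python
import hashlib
import numpy as np

def mllm_embed(text: str, dim: int = 64) -> np.ndarray:
    """Deterministic toy embedding (hash-seeded random vector).
    A real system would read hidden states from a frozen MLLM."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def mllm_rerank_score(query: str, candidate: str) -> float:
    """Toy relevance score (word-overlap Jaccard). FreeRet instead
    prompts the same MLLM to reason about query-candidate relevance."""
    q, c = set(query.lower().split()), set(candidate.lower().split())
    return len(q & c) / max(len(q | c), 1)

def retrieve_then_rerank(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Stage 1: coarse candidate search by embedding similarity.
    qv = mllm_embed(query)
    sims = sorted(((float(qv @ mllm_embed(doc)), doc) for doc in corpus),
                  reverse=True)
    candidates = [doc for _, doc in sims[:k]]
    # Stage 2: precise reranking with the model's reasoning pass.
    return sorted(candidates,
                  key=lambda d: mllm_rerank_score(query, d), reverse=True)
```

Both stages reuse one frozen model, which is what makes the approach training-free; no contrastive fine-tuning step sits between the MLLM and the retrieval task.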

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Enables MLLMs to act as powerful, training-free retrievers, potentially simplifying RAG systems and enhancing multimodal search capabilities.

RANK_REASON This is a research paper describing a new framework for MLLMs.

Read on arXiv cs.CV →

COVERAGE [1]

  1. arXiv cs.CV TIER_1 · Yuhan Zhu, Xiangyu Zeng, Chenting Wang, Xinhao Li, Chunxu Liu, Yicheng Xu, Ziang Yan, Yi Wang, Limin Wang

    FreeRet: MLLMs as Training-Free Retrievers

    arXiv:2509.24621v2 Announce Type: replace Abstract: Multimodal large language models (MLLMs) are emerging as versatile foundations for mixed-modality retrieval. Yet, they often require heavy post-hoc training to convert them into contrastive encoders for retrieval. This work asks…