MM-Matryoshka: Towards Budget-Elastic Visual Document Retrieval via a 2D Multimodal Matryoshka Training Framework
Researchers have introduced MM-Matryoshka, a 2D training framework designed to make visual document retrieval budget-elastic. A single trained model can trade retrieval accuracy against available computational resources by selecting a budget along two axes: embedding vector width and encoder depth. Experiments show that MM-Matryoshka significantly reduces storage and computational overhead compared to existing methods while maintaining high-quality retrieval.
IMPACT: Enables more efficient deployment of visual document retrieval systems by allowing dynamic adjustment of computational resources.
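
To make the two budget axes concrete, here is a minimal sketch of how a (depth, width) budget could be applied at retrieval time. It is not the authors' implementation: the summary does not describe how the model exposes intermediate outputs, so the sketch assumes that each encoder layer yields a pooled embedding and that the leading dimensions of that embedding are usable on their own. The function name `elastic_embedding`, the layer count, the embedding size, and the random stand-in data are all hypothetical.

```python
import numpy as np


def elastic_embedding(layer_outputs, depth, width):
    """Return a unit-length embedding for a given (depth, width) budget.

    layer_outputs: array of shape (num_layers, full_width), one pooled
        vector per encoder layer (an assumption for this sketch).
    depth: number of encoder layers to use (1-based early exit).
    width: number of leading embedding dimensions to keep.
    """
    vec = layer_outputs[depth - 1][:width]
    norm = np.linalg.norm(vec)
    # Renormalize so dot products remain cosine similarities after truncation.
    return vec / norm if norm > 0 else vec


# Random stand-ins for the per-layer outputs of a real encoder (hypothetical sizes).
rng = np.random.default_rng(0)
num_layers, full_width = 12, 768
query_layers = rng.standard_normal((num_layers, full_width))
doc_layers = rng.standard_normal((num_layers, full_width))

# Sweep from the full budget down to a small one.
for depth, width in [(12, 768), (6, 256), (3, 64)]:
    q = elastic_embedding(query_layers, depth, width)
    d = elastic_embedding(doc_layers, depth, width)
    score = float(q @ d)
    print(f"depth={depth:2d} width={width:3d} dims_stored={d.size} score={score:+.4f}")
```

In this sketch, a smaller width shrinks the stored vector for each document, and a smaller depth means fewer encoder layers need to run. Those are the storage and compute savings the summary refers to. With random inputs the scores carry no meaning; only the shapes and the selection logic are illustrative.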