Vortex system enhances video retrieval with multi-modal fusion · 1 source tracked

By PulseAugur Editorial · [1 source] · 2026-06-19 04:00

Vortex is a multi-modal video retrieval system developed by the FocusOnFun team for the Ho Chi Minh City AI Challenge 2025. It combines adaptive keyframe extraction, metadata generated by vision-language and speech models, and a hybrid retrieval strategy that uses both CLIP and SigLIP2 embeddings. The system also includes Rocchio-based relevance feedback and a multi-stage temporal search mechanism, and it is built on Milvus and Elasticsearch for scalability. FocusOnFun reports excellent performance in the competition, which points to the effectiveness of the hybrid approach.
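To make two of the named techniques concrete, here is a minimal sketch of Rocchio relevance feedback and a weighted fusion of scores from two embedding models. It is illustrative only: the summary does not give the paper's formulas, weights, or code, so the function names (`rocchio_update`, `fuse_scores`), the default `alpha`/`beta`/`gamma` values, and the weighted-sum fusion rule are all assumptions, and the toy vectors stand in for real CLIP or SigLIP2 embeddings.

```python
import math


def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors (zero vector if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio update: move the query toward the centroid of results
    marked relevant and away from the centroid of results marked non-relevant.
    The default weights are common textbook values, not values from the paper."""
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(non_relevant, dim)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Hypothetical hybrid fusion: weighted sum of per-keyframe scores from two
    models. A keyframe missing from one score map contributes 0.0 for it."""
    keys = set(scores_a) | set(scores_b)
    return {
        k: weight_a * scores_a.get(k, 0.0) + (1.0 - weight_a) * scores_b.get(k, 0.0)
        for k in keys
    }


# Toy example: the user marks one keyframe as relevant and one as non-relevant.
query = [1.0, 0.0]
updated = rocchio_update(query, relevant=[[0.0, 1.0]], non_relevant=[[1.0, 0.0]])

# Toy example: similarity scores for the same query from two embedding models.
fused = fuse_scores({"kf1": 0.8, "kf2": 0.2}, {"kf1": 0.4, "kf3": 0.6})
ranking = sorted(fused, key=fused.get, reverse=True)
```

In this sketch the updated query vector is more similar to the relevant keyframe than the original query was, and `ranking` orders keyframes by their fused score. A production system would typically normalize each model's scores before fusing them, since raw similarities from different models are not directly comparable.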

IMPACT This system advances intelligent multimedia search and temporal reasoning in video retrieval.

RANK_REASON The item is a research paper detailing a system for video retrieval.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CV TIER_1 Dansk(DA) · Duc-Tho Nguyen, Hieu-Hoc Tran-Minh, Khanh-Hoa Lam, Hoang-Nhut Ly, Huu-Phuc Huynh, Thanh-Tien Tran, Trung-Nghia Le · 2026-06-19 04:00

Vortex: Multi-Modal Fusion System for Intelligent Video Retrieval

arXiv:2606.19682v1 Announce Type: new Abstract: This paper presents Vortex, the multimodal video retrieval system developed by our team, FocusOnFun, for the Ho Chi Minh City AI Challenge 2025, designed to advance intelligent multimedia search and temporal reasoning. The system in…
