PulseAugur

OneTrackerV2 unifies multimodal visual tracking with Dual Mixture-of-Experts

Researchers have developed an event-based visual object tracking framework that explicitly models how event density varies across multiple temporal scales, a factor existing methods largely overlook. The approach injects sparse, medium-density, and dense event search regions into a Vision Transformer backbone for hierarchical feature learning. A sparsity-aware Mixture-of-Experts module encourages expert specialization, and a dynamic pondering strategy adapts inference depth to tracking difficulty. The authors report favorable accuracy-efficiency trade-offs on benchmark datasets.
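To make the multi-scale idea concrete, here is a minimal sketch of how event frames at different densities could be produced by accumulating the same event stream over short, medium, and long time windows. This is not the authors' implementation: the event tuple layout, the window lengths, the mapping of window length to the names `sparse`, `medium`, and `dense`, and the helper names `accumulate_events`, `multi_scale_frames`, and `density` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical event layout: (timestamp_us, x, y, polarity), polarity in {-1, +1}.
Event = Tuple[int, int, int, int]
Frame = List[List[int]]


def accumulate_events(events: List[Event], t_end: int, window: int,
                      width: int, height: int) -> Frame:
    """Sum polarities of events with timestamp in (t_end - window, t_end]."""
    frame = [[0] * width for _ in range(height)]
    for t, x, y, p in events:
        in_window = t_end - window < t <= t_end
        in_bounds = 0 <= x < width and 0 <= y < height
        if in_window and in_bounds:
            frame[y][x] += p
    return frame


def multi_scale_frames(events: List[Event], t_end: int,
                       windows: Dict[str, int],
                       width: int, height: int) -> Dict[str, Frame]:
    """Build one accumulated frame per named temporal window."""
    return {name: accumulate_events(events, t_end, w, width, height)
            for name, w in windows.items()}


def density(frame: Frame) -> float:
    """Fraction of pixels that received at least one net event."""
    active = sum(1 for row in frame for v in row if v != 0)
    return active / (len(frame) * len(frame[0]))


if __name__ == "__main__":
    # Toy stream on a 2x2 sensor, timestamps in microseconds.
    stream = [(1000, 0, 0, 1), (5000, 1, 0, -1), (9000, 1, 1, 1), (9500, 0, 1, 1)]
    # Assumed window lengths: shorter windows yield sparser frames.
    scales = {"sparse": 1000, "medium": 6000, "dense": 10000}
    frames = multi_scale_frames(stream, t_end=10000, windows=scales,
                                width=2, height=2)
    for name, frame in frames.items():
        print(name, density(frame))
```

In a tracker along these lines, each frame would be cropped to the search region and tokenized before entering the backbone. How the three scales are fused is specific to the paper and is not shown here.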

Impact: Introduces novel techniques for event-based visual tracking, potentially improving performance in challenging conditions.

Ranking rationale: The cluster contains two distinct arXiv papers detailing novel research in computer vision and object tracking.


AI-generated summary · Google Gemini · from 5 sources.


Sources [5]

  1. Hugging Face Daily Papers TIER_1 English(EN) ·

    Unified Multimodal Visual Tracking with Dual Mixture-of-Experts

    Multimodal visual object tracking can be divided into several kinds of tasks (e.g. RGB and RGB+X tracking), based on the input modality. Existing methods often train separate models for each modality or rely on pretrained models to adapt to new modalities, which limits efficie…

  2. arXiv cs.CV TIER_1 English(EN) · Shiao Wang, Xiao Wang, Duoqing Yang, Wenhao Zhang, Bo Jiang, Lin Zhu, Yonghong Tian, Bin Luo ·

    Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking

    arXiv:2605.06112v1. Despite significant progress, RGB-based trackers remain vulnerable to challenging imaging conditions, such as low illumination and fast motion. Event cameras offer a promising alternative by asynchronously capturing pixel-wise brigh…

  3. arXiv cs.CV TIER_1 English(EN) · Bin Luo ·

    Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking

    Despite significant progress, RGB-based trackers remain vulnerable to challenging imaging conditions, such as low illumination and fast motion. Event cameras offer a promising alternative by asynchronously capturing pixel-wise brightness changes, providing high dynamic range and …

  4. arXiv cs.CV TIER_1 English(EN) · Lingyi Hong, Jinglun Li, Xinyu Zhou, Kaixun Jiang, Pinxue Guo, Zhaoyu Chen, Runze Li, Xingdong Sheng, Wenqiang Zhang ·

    Unified Multimodal Visual Tracking with Dual Mixture-of-Experts

    arXiv:2605.03716v1. Multimodal visual object tracking can be divided into several kinds of tasks (e.g. RGB and RGB+X tracking), based on the input modality. Existing methods often train separate models for each modality or rely on pretrained models …

  5. arXiv cs.CV TIER_1 English(EN) · Wenqiang Zhang ·

    Unified Multimodal Visual Tracking with Dual Mixture-of-Experts

    Multimodal visual object tracking can be divided into several kinds of tasks (e.g. RGB and RGB+X tracking), based on the input modality. Existing methods often train separate models for each modality or rely on pretrained models to adapt to new modalities, which limits efficie…