PulseAugur

ViewSAM model uses foundation models for weakly supervised cross-view object tracking

Researchers have developed ViewSAM, a framework for weakly supervised Cross-view Referring Multi-Object Tracking (CRMOT), the task of tracking objects specified by natural language across multiple camera views. The approach uses foundation models such as SAM2 and SAM3 to generate pseudo-supervision, reducing the need for costly frame-level annotations. ViewSAM explicitly models view-aware cross-modal semantics, enabling robust tracking across different camera perspectives with minimal additional parameters.

Impact: Introduces a more annotation-efficient method for multi-object tracking across camera views by reducing reliance on extensive frame-level annotations.

Ranking rationale: The cluster contains a research paper detailing a new model and framework for a specific computer vision task.


AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.CV · TIER_1 · English (EN) · Jiawei Ge, Xintian Zhang, Jiuxin Cao, Bo Liu, Fabian Deuser, Chang Liu, Gong Wenkang, Siyou Li, Juexi Shao, Wenqing Wu, Chen Feng, Ioannis Patras

    ViewSAM: Learning View-aware Cross-modal Semantics for Weakly Supervised Cross-view Referring Multi-Object Tracking

    arXiv:2605.02638v1 · Announce Type: new · Abstract: Cross-view Referring Multi-Object Tracking (CRMOT) aims to track multiple objects specified by natural language across multiple camera views, with globally consistent identities. Despite recent progress, existing methods rely heavil…

  2. arXiv cs.CV · TIER_1 · English (EN) · Ioannis Patras

    ViewSAM: Learning View-aware Cross-modal Semantics for Weakly Supervised Cross-view Referring Multi-Object Tracking

    Cross-view Referring Multi-Object Tracking (CRMOT) aims to track multiple objects specified by natural language across multiple camera views, with globally consistent identities. Despite recent progress, existing methods rely heavily on costly frame-level spatial annotations and …