New frameworks advance open-vocabulary object detection · tracking 3 sources

By PulseAugur Editorial · [4 sources] · 2026-06-24 04:00

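Researchers have developed new methods for open-vocabulary object detection, which aims to identify objects from categories not seen during training. One approach, 3F-OVD, introduces a new fine-grained open-vocabulary detection task and a dataset (NEU-171K) that require a deeper understanding of image details and captions. Another, MSPL, uses multi-step pseudo-labeling, decomposing scene understanding into localization, recognition, and association steps to improve accuracy in complex scenes. A third framework uses CLIP for object segmentation and recognition, reports strong performance, and explores CLIP-independent encoding as an alternative.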
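
As a rough illustration of the idea shared by CLIP-style open-vocabulary methods (not the pipeline of any paper listed here), a detected region can be labeled by comparing its embedding against text embeddings of arbitrary class names. The sketch below uses made-up 3-dimensional vectors and the hypothetical helper names `cosine` and `classify_region`; a real system would obtain both kinds of embeddings from a pretrained vision-language model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_region(region_embedding, text_embeddings):
    """Return (label, score) for the class name whose text embedding
    is most similar to the region embedding."""
    scores = {
        label: cosine(region_embedding, emb)
        for label, emb in text_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy embeddings (assumed values, for illustration only).
# The vocabulary is just a dict, so new class names can be added
# at inference time without retraining anything.
text_embeddings = {
    "zebra": [0.9, 0.1, 0.0],
    "traffic cone": [0.0, 0.2, 0.9],
    "kayak": [0.1, 0.9, 0.1],
}
region_embedding = [0.8, 0.2, 0.1]

label, score = classify_region(region_embedding, text_embeddings)
print(label, round(score, 3))
```

Because the label set is only a dictionary of text embeddings, extending the vocabulary amounts to adding an entry, which is what distinguishes this setup from a detector with a fixed classification head.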

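Impact: These advances push the boundaries of object recognition, enabling AI systems to identify and understand a wider range of objects across varied visual settings.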

Ranking rationale: Three separate research papers introduce new methods and datasets for open-vocabulary object detection.


AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

arXiv cs.CV TIER_1 English(EN) · Qijun Chen · 2026-06-24 07:06

TACO: Towards Task-Consistent Open-Vocabulary Adaptation in Video Recognition

Adapting CLIP for open-vocabulary video recognition necessitates a delicate balance between newly acquired video knowledge and the pretrained generalization. While existing studies pursue this generalization-specialization trade-off with additional regularizations or constraints,…

arXiv cs.CV TIER_1 English(EN) · Ying Liu, Yijing Hua, Haojiang Chai, Yanbo Wang, TengQi Ye · 2026-06-24 04:00

Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark

arXiv:2503.14862v3 · Open-vocabulary detectors are proposed to locate and recognize objects in novel classes. However, variations in vision-aware language vocabulary data used for open-vocabulary learning can lead to unfair and unreliable evaluation…

arXiv cs.CV TIER_1 English(EN) · Hojun Choi, Youngsun Lim, Jaeyo Shin, Hyunjung Shim · 2026-06-24 04:00

MSPL: Multi-Step Pseudo-Labeling for Open-Vocabulary Object Detection

arXiv:2510.14792v4 · Open-vocabulary object detection (OVD) aims to recognize and localize object categories beyond the training set. Recent approaches leverage vision-language models to generate pseudo-labels using image-text alignment, allowing de…

arXiv cs.CV TIER_1 English(EN) · Wei Yu Chen, Ying Dai · 2026-06-24 04:00

A Novel Framework Using CLIP for Open-Vocabulary Multi-Object Recognition

arXiv:2603.05962v2 · To address the limitations of existing open-vocabulary object recognition methods, including high system complexity, substantial training costs, and limited generalization capability, this paper proposes a novel Open-Vocabulary …
