PulseAugur

COCO

PulseAugur coverage of COCO — every cluster mentioning COCO across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 20 (90 days: 20)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 20 (90 days: 20)
Tier distribution · 90 days
Relations
Sentiment · 30 days: 2 days with sentiment data

Recent · Page 1/1 · 20 items
  1. RESEARCH · CL_41767 ·

    VISTA system wins Ego4D challenge with object interaction anticipation

    Researchers have developed VISTA, a novel system designed for anticipating human-object interactions in egocentric videos. VISTA integrates spatial object detection with temporal context from a frozen V-JEPA 2.1 model t…

  2. RESEARCH · CL_18576 ·

    Researchers unveil new stealthy backdoor attacks on AI models using diffusion and style features

    Researchers have developed new methods for backdoor attacks on advanced AI models, specifically targeting Vision-Language Models (VLMs) and Diffusion Models (DMs). One approach, CBV, uses diffusion models to create natu…

  3. TOOL · CL_15733 ·

    FractalMamba++ scales vision models across resolutions using Hilbert curves

    Researchers have introduced FractalMamba++, an enhanced vision backbone designed to improve the performance of Mamba-based models, particularly with high-resolution inputs. This new architecture leverages the geometric …

  4. TOOL · CL_15617 ·

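    Colinearity Decay trains Vision Transformers for better low-bit quantization

    Researchers have developed a new training technique called Colinearity Decay (CD) to make Vision Transformers (ViTs) more amenable to low-bit quantization. The method acts as a structural regularizer that penalizes alignment within Transformer blocks to mitigate harmful activation outliers, without affecting the architecture or the task loss. CD aims to improve the accuracy of quantized models while maintaining or improving full-precision performance, offering a route to efficient ViT deployment with no inference-time overhead.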

  5. RESEARCH · CL_18683 ·

    New methods improve open-vocabulary object detection robustness and adaptation

    Researchers have introduced several new methods to improve open-vocabulary object detection, a field that aims to identify arbitrary objects based on human prompts. One approach, EBOD, integrates a prompt-based detector…

  6. RESEARCH · CL_15520 ·

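    Hyp2Former uses hyperbolic embeddings for open-set panoptic segmentation

    Researchers have developed Hyp2Former, a novel framework for open-set panoptic segmentation that exploits hierarchical semantic similarity in hyperbolic space. By encoding the relationships between categories, the approach lets the model better separate unknown objects from known categories, even without explicit training on unknown object types. Empirical results on datasets such as MS COCO and Cityscapes show that Hyp2Former outperforms existing methods at identifying unknown objects while remaining robust on known categories.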

  7. RESEARCH · CL_14350 ·

    Object detection models show mixed robustness to quantization and input degradations

    A new study investigates how post-training quantization (PTQ) affects the robustness of YOLO object detection models when faced with real-world input degradations like noise and blur. Researchers evaluated various preci…

  8. RESEARCH · CL_14347 ·

    GPT-4o and other multimodal models evaluated on computer vision tasks

    A new paper evaluates how well multimodal foundation models, including GPT-4o and Gemini 1.5 Pro, perform on standard computer vision tasks. Researchers developed a prompt-chaining method to translate vision tasks into …

  9. RESEARCH · CL_14043 ·

    Flow Matching research advances efficiency, control, and applications

    Recent research explores advancements in Flow Matching, a generative modeling technique. Several papers introduce new methods to improve its efficiency, controllability, and applicability to diverse data types. Innovati…

  10. RESEARCH · CL_11804 ·

    New dataset aids computer vision identification of parasitoid wasps

    Researchers have introduced the Descriptor: Parasitoid Wasps and Associated Hymenoptera Dataset (DAPWH), a new image collection aimed at improving automated identification of crucial insect groups. The dataset comprises…

  11. RESEARCH · CL_11769 ·

    New DBAC metric measures and identifies bias amplification in image captions

    Researchers have introduced a new metric called Directional Bias Amplification in Captioning (DBAC) to measure and identify how image captioning models worsen biases present in their training data. Unlike previous metri…

  12. RESEARCH · CL_11371 ·

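    Researchers propose fuzzy logic for robust image recognition through knowledge discovery

    Researchers have developed a novel method that improves the robustness of image recognition by integrating domain knowledge into deep neural networks. The method introduces a Differentiable Knowledge Unit (DKU) that uses fuzzy logic and implication rules to modulate the classifier's logits and refine class probabilities. The system automatically discovers implicit concepts from task supervision, learning the relationships between classes and those concepts without requiring explicit concept labels. Evaluations on the PASCAL-VOC, COCO, and MedMNIST datasets show improvements in both performance and domain generalization.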

  13. RESEARCH · CL_11442 ·

    Researchers find single hub text exploits vulnerabilities in CLIP cross-modal encoders

    Researchers have identified a vulnerability in cross-modal encoders like CLIP, which map text and images into a shared embedding space. They discovered that a single "hub text" can generate high similarity scores with n…

  14. RESEARCH · CL_09738 ·

    ViCrop-Det improves small-object detection with adaptive spatial routing

    Researchers have introduced ViCrop-Det, a novel framework designed to improve small-object detection in images without requiring additional training. This method utilizes Spatial Attention Entropy (SAE) derived from a m…

  15. RESEARCH · CL_08199 ·

    New metric T3S evaluates semantic similarity in low-level image processing

    Researchers have introduced a new evaluation metric called Semantic Similarity Score (T3S) for low-level image processing tasks. This metric aims to assess whether the semantic content of an image is preserved after pro…

  16. RESEARCH · CL_20317 ·

    Diffusion models boost AI's vision for segmentation and anomaly detection

    Researchers have developed DiCLIP, a new framework for weakly supervised semantic segmentation that enhances the capabilities of CLIP by integrating diffusion models. This approach addresses CLIP's limitations in dense …

  17. RESEARCH · CL_06459 ·

    New OVD method improves object detection with hierarchical consistency and unbiased objectness

    Researchers have developed a new framework to improve open-vocabulary object detection (OVD), a technique that allows AI models to identify objects beyond their training data. The proposed method addresses inaccuracies …

  18. RESEARCH · CL_06427 ·

    New framework enhances federated cross-modal retrieval with missing modalities

    Researchers have developed RCSR, a new framework designed to improve federated cross-modal retrieval, particularly when dealing with data heterogeneity and missing modalities across clients. The system utilizes a frozen…

  19. RESEARCH · CL_06398 ·

    HalalBench benchmark tackles OCR challenges for multilingual food packaging ingredient extraction

    Researchers have introduced HalalBench, a new multilingual benchmark designed to evaluate Optical Character Recognition (OCR) performance specifically on food packaging ingredient labels. The benchmark addresses the uni…

  20. RESEARCH · CL_06173 ·

    BMD-45 dataset improves CCTV vehicle detection in developing cities

    Researchers have introduced BMD-45, a new large-scale dataset designed to improve vehicle detection in urban traffic environments found in developing cities. This dataset contains over 45,000 images with 480,000 boundin…