Brief · PulseAugur

TOOL · Hugging Face Daily Papers English(EN) · 2w

CNNs, Transformers, Hybrid, and Vision Language Models for Skin Cancer Detection

Researchers have conducted a comprehensive evaluation of twelve deep learning models for detecting skin cancer using a unified approach on the PAD-UFES-20 dataset. The study compared convolutional neural networks (CNNs), vision transformers (ViTs), hybrid models, and vision-language models (VLMs). While well-tuned CNNs offer a solid baseline, transformer-based architectures generally showed superior discrimination capabilities. Hybrid models and a SigLIP-based VLM achieved the best overall performance, providing practical insights for real-world deployment in skin cancer screening. AI

IMPACT Provides practical guidance on selecting deep learning models for real-world skin cancer screening applications.

Transformers
Vision Language Models
CNNs
SigLIP
PAD-UFES-20 dataset
MaxViT Tiny
CoAtNet0