An Open-Source Two-Stage Computer Vision Pipeline for Fine-Grained Vehicle Classification using Vision Transformers
Researchers have developed an open-source, two-stage computer vision pipeline for fine-grained vehicle classification, designed to support assessment of injury risk to cyclists. The system pairs a pre-trained RT-DETR detector with a fine-tuned Vision Transformer (ViT-Base/16) that categorizes vehicles into six types. It achieved high accuracy on in-distribution data and remained robust on out-of-distribution datasets. A confidence-based abstention mechanism lets it withhold a prediction when it is uncertain.
AI IMPACT: Provides an open-source tool for analyzing traffic video, potentially improving road safety research and urban planning.
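The confidence-based abstention step mentioned in the summary can be illustrated with a minimal sketch. It assumes that confidence is the maximum softmax probability over the six classifier outputs and that the system abstains below a fixed threshold; the summary does not state how confidence is actually computed. The class names, the function name `classify_with_abstention`, and the threshold of 0.8 are all hypothetical placeholders, not values from the work.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical label set: the summary says there are six vehicle types
# but does not name them, so neutral placeholders are used here.
CLASSES = ["class_0", "class_1", "class_2", "class_3", "class_4", "class_5"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_abstention(
    logits: List[float], threshold: float = 0.8
) -> Tuple[Optional[str], float]:
    """Return (label, confidence), or (None, confidence) when abstaining.

    Assumption: confidence is the largest softmax probability, and the
    threshold value is illustrative only.
    """
    if len(logits) != len(CLASSES):
        raise ValueError(f"expected {len(CLASSES)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < threshold:
        return None, confidence  # abstain: the prediction is too uncertain
    return CLASSES[best], confidence


if __name__ == "__main__":
    # One dominant score: the sketch returns a label.
    print(classify_with_abstention([10.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
    # Flat scores: the sketch abstains.
    print(classify_with_abstention([0.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

In a two-stage setup like the one described, a function of this kind would be applied to the classifier output for each vehicle crop produced by the detector. Raising the threshold trades coverage for fewer confident mistakes, which is the usual motivation for abstaining on out-of-distribution inputs.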