Per-pixel bounding-box regression + DBSCAN for handwritten word detection - visual walkthrough of WordDetectorNet [P]
A new approach to handwritten word detection, called WordDetectorNet, uses per-pixel bounding-box regression combined with DBSCAN clustering. Instead of anchor-based detection followed by Non-Maximum Suppression, the model classifies each pixel as a "word pixel" and, for each such pixel, regresses the distances from that pixel to the edges of the word's bounding box. The resulting thousands of overlapping candidate boxes are then clustered using DBSCAN with 1 - IoU as the distance metric, and the median box of each cluster is selected as the final detection.
AI IMPACT Introduces an anchor-free, NMS-free approach to object detection that could influence future computer vision models.
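The post-processing described in the summary (decode one box per word pixel, cluster the boxes with DBSCAN on a 1 - IoU distance, keep the per-cluster median) can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the WordDetectorNet code: the function names, the `(left, top, right, bottom)` distance layout, and the `eps=0.5` and `min_samples=3` values are assumptions made for the example, and the tiny DBSCAN below could be swapped for a library implementation that accepts a precomputed distance matrix.

```python
from statistics import median


def decode_boxes(word_mask, dists):
    """Turn per-pixel predictions into candidate boxes (x0, y0, x1, y1).

    word_mask[y][x] is True for predicted word pixels; dists[y][x] holds the
    regressed (left, top, right, bottom) distances (assumed layout).
    """
    boxes = []
    for y, row in enumerate(word_mask):
        for x, is_word in enumerate(row):
            if is_word:
                left, top, right, bottom = dists[y][x]
                boxes.append((x - left, y - top, x + right, y + bottom))
    return boxes


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def dbscan(boxes, eps=0.5, min_samples=3):
    """Minimal DBSCAN with distance 1 - IoU; returns labels, -1 means noise."""
    n = len(boxes)
    neighbors = [
        [j for j in range(n) if 1.0 - iou(boxes[i], boxes[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep growing
    return labels


def median_boxes(boxes, labels):
    """Collapse each cluster to its coordinate-wise median box."""
    clusters = {}
    for box, label in zip(boxes, labels):
        if label != -1:
            clusters.setdefault(label, []).append(box)
    return [
        tuple(median(coord) for coord in zip(*members))
        for _, members in sorted(clusters.items())
    ]


def demo():
    """Two synthetic words with slightly noisy per-pixel predictions."""
    truth = [(2, 1, 8, 4), (12, 1, 18, 4)]
    height, width = 6, 20
    mask = [[False] * width for _ in range(height)]
    dists = [[None] * width for _ in range(height)]
    for x0, y0, x1, y1 in truth:
        for y in range(y0 + 1, y1):
            for x in range(x0 + 1, x1):
                jitter = 0.25 * ((x + y) % 3 - 1)  # stand-in for regression noise
                mask[y][x] = True
                dists[y][x] = (x - x0 + jitter, y - y0, x1 - x, y1 - y)
    boxes = decode_boxes(mask, dists)
    return median_boxes(boxes, dbscan(boxes))
```

Calling `demo()` decodes 20 candidate boxes (10 per synthetic word), groups them into two clusters, and returns one median box per word. Taking the median rather than the mean keeps a few badly regressed pixels from dragging the final box, which is presumably why the summary singles it out.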