PicoSAM3: Real-Time In-Sensor Region-of-Interest Segmentation
Researchers have developed PicoSAM3, a lightweight segmentation model designed to run in real time on edge devices and even directly on image sensors. The model has 1.3 million parameters, uses a dense CNN architecture, and combines region-of-interest prompt encoding with knowledge distillation from larger models. PicoSAM3 achieves strong performance on benchmarks such as COCO and LVIS, and its quantized version runs inference in under 12 milliseconds on the Sony IMX500 sensor while staying within the sensor's operational constraints.
AI IMPACT Enables real-time, privacy-preserving visual processing directly on edge devices and sensors.
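
The summary mentions knowledge distillation from larger models without giving details. As a rough illustration only, the sketch below shows one common form of mask distillation: the small student model is penalized with a per-pixel binary cross-entropy against the teacher's soft mask probabilities. The function names, the choice of loss, and the toy 2x2 masks are assumptions made for this example, not PicoSAM3's actual training objective.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def distillation_loss(student_logits, teacher_probs, eps=1e-7):
    """Mean per-pixel binary cross-entropy between the student's predicted
    mask and the teacher's soft mask (hypothetical helper, for illustration).

    student_logits: 2D list of raw student outputs, one per pixel.
    teacher_probs:  2D list of teacher foreground probabilities in [0, 1].
    """
    total = 0.0
    count = 0
    for s_row, t_row in zip(student_logits, teacher_probs):
        for s, t in zip(s_row, t_row):
            # Clamp to avoid log(0) when the student is extremely confident.
            p = min(max(sigmoid(s), eps), 1.0 - eps)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


# Toy 2x2 masks: the teacher marks the left column as foreground.
teacher = [[0.9, 0.1], [0.8, 0.2]]
agreeing = [[2.0, -2.0], [1.5, -1.5]]      # student matches the teacher
disagreeing = [[-2.0, 2.0], [-1.5, 1.5]]   # student predicts the opposite

loss_agree = distillation_loss(agreeing, teacher)
loss_disagree = distillation_loss(disagreeing, teacher)
print(loss_agree < loss_disagree)
```

A student whose mask agrees with the teacher gets a lower loss than one that predicts the opposite mask, which is the signal that lets a small model imitate a larger one during training.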