A Hybrid Vision-Language Architecture for Automated Defect Reasoning and Report Generation in Industrial Inspection
Researchers have developed a hybrid architecture for automated industrial inspection, targeting wind turbine blade maintenance. The system pairs a vision model for defect localization with a language model for report generation, decoupling the two tasks to improve efficiency and accuracy. It combines a YOLO26-x-obb detector, a custom encoding module, and a 4-bit quantized Qwen-2.5-1.5B model fine-tuned with synthetic data and retrieval augmentation.
AI IMPACT: This hybrid architecture demonstrates that specialized, decoupled models can be more effective than monolithic VLMs for structured generation tasks in industrial settings.
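The summary does not say how the custom encoding module works, so the following is only a minimal sketch of the decoupling idea: oriented-box detections are serialized into compact text, which a separate language model could then turn into a report. Every name here (`ObbDetection`, `encode_detection`, `build_prompt`), the field layout, and the prompt format are hypothetical illustrations, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class ObbDetection:
    """One oriented-bounding-box detection (hypothetical field layout)."""
    label: str
    confidence: float
    cx: float          # box center x, in pixels
    cy: float          # box center y, in pixels
    width: float       # box side length, in pixels
    height: float      # other box side length, in pixels
    angle_rad: float   # box rotation, in radians


def encode_detection(det, image_w, image_h):
    """Serialize one detection as a short text line (assumed format)."""
    long_side = max(det.width, det.height)
    short_side = min(det.width, det.height)
    return (
        f"{det.label} conf={det.confidence:.2f} "
        f"center=({det.cx / image_w:.2f},{det.cy / image_h:.2f}) "
        f"size={long_side:.0f}x{short_side:.0f}px "
        f"angle={math.degrees(det.angle_rad):.0f}deg"
    )


def build_prompt(detections, image_w, image_h, retrieved_notes=()):
    """Assemble a text-only prompt for a language model.

    retrieved_notes stands in for retrieval-augmented context; how the
    original system retrieves or formats such context is not stated.
    """
    lines = ["DEFECTS:"]
    if detections:
        lines += [f"- {encode_detection(d, image_w, image_h)}" for d in detections]
    else:
        lines.append("- none detected")
    if retrieved_notes:
        lines.append("REFERENCE NOTES:")
        lines += [f"- {note}" for note in retrieved_notes]
    lines.append("TASK: Write a concise inspection report for the defects listed above.")
    return "\n".join(lines)


# Example usage with made-up values.
example = ObbDetection("crack", 0.91, 320, 240, 100, 20, math.pi / 2)
prompt = build_prompt([example], 640, 480, ["Hypothetical maintenance guideline text."])
```

Under these assumptions the language model never sees pixels, only the text produced by `build_prompt`, which is one way a small quantized model could be kept to a structured generation task.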