Researchers have developed DualStreamHybrid, a two-stream framework for video action recognition that uses heterogeneous backbones for RGB and optical-flow inputs. It assigns a Vision Transformer (ViT-Tiny/16) to RGB frames and a MobileNetV2 to optical flow, on the premise that the two modalities have distinct properties. The framework was evaluated on the UCF11 and UCF50 datasets, where the cross-attention and weighted fusion strategies showed promising results, with accuracy reaching up to 98.12% on UCF11.
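To make the weighted fusion idea concrete, here is a minimal sketch of one common form of it: a convex combination of the per-stream class probabilities. The summary does not say how DualStreamHybrid defines its weighted fusion, so the function names (`softmax`, `weighted_fusion`, `predict`), the `rgb_weight` parameter, and the choice to fuse after the softmax are all assumptions made for illustration, not the authors' method. The logits stand in for the outputs of the RGB (ViT-Tiny/16) and optical-flow (MobileNetV2) streams.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(rgb_logits, flow_logits, rgb_weight=0.5):
    """Fuse two streams as a convex combination of their class probabilities.

    ASSUMPTION: late fusion on softmax outputs with one scalar weight.
    The source does not specify the actual fusion formula.
    """
    if len(rgb_logits) != len(flow_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    p_rgb = softmax(rgb_logits)
    p_flow = softmax(flow_logits)
    flow_weight = 1.0 - rgb_weight
    return [rgb_weight * a + flow_weight * b for a, b in zip(p_rgb, p_flow)]


def predict(fused_probs):
    """Return the index of the most probable class."""
    return max(range(len(fused_probs)), key=fused_probs.__getitem__)
```

Calling `predict(weighted_fusion(rgb_logits, flow_logits, rgb_weight=0.5))` returns the class index favoured by the combined streams. In practice the weight could be tuned on a validation set or learned jointly with the backbones; the sketch fixes it as a plain argument to keep the example self-contained.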
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Introduces a novel architecture for video action recognition that may improve performance on complex motion and appearance tasks.
RANK_REASON This is a research paper introducing a new framework for video action recognition.