CapStARE: Capsule-based Sequential Architecture for Robust and Efficient Gaze Estimation
Researchers have developed CapStARE, a capsule-based architecture for gaze estimation. A frozen ConvNeXt backbone extracts features efficiently, capsule formation with attention-based routing supports structured facial reasoning, and dual GRU decoders provide lightweight sequential modeling. The system achieves real-time inference speeds and strong performance on benchmark datasets such as ETH-XGaze and MPIIFaceGaze.
AI IMPACT This architecture offers a practical and robust framework for real-time gaze estimation, potentially improving human-computer interaction and robotics applications.
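To make the attention-based routing step concrete, here is a minimal sketch in plain Python. The summary does not give CapStARE's routing formula, so this uses generic scaled dot-product attention pooling as a stand-in. The names `attention_route`, `capsules`, and `query` are hypothetical, and the toy vectors are illustrative rather than real features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_route(capsules, query):
    """Pool capsule vectors into one vector with scaled dot-product attention.

    capsules: list of equal-length float lists (hypothetical facial-part capsules)
    query:    float list of the same length (hypothetical learned query vector)
    Returns (weights, pooled). This is an assumed formulation, not the paper's.
    """
    dim = len(query)
    # Agreement between each capsule and the query, scaled by sqrt(dim).
    scores = [
        sum(c * q for c, q in zip(capsule, query)) / math.sqrt(dim)
        for capsule in capsules
    ]
    weights = softmax(scores)
    # Weighted sum of capsules: higher-agreement capsules contribute more.
    pooled = [
        sum(w * capsule[d] for w, capsule in zip(weights, capsules))
        for d in range(dim)
    ]
    return weights, pooled


# Toy input: three 2-D capsules and a query aligned with the first axis.
capsules = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
weights, pooled = attention_route(capsules, query)
```

In a full pipeline along the lines the summary describes, the capsules would presumably be derived from the frozen backbone's features and the pooled vector passed on to the GRU decoders. That wiring is an assumption here and is not shown.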