Researchers have developed a novel two-stage framework for gaze target estimation that explicitly incorporates object semantics. This approach moves beyond traditional pixel-level regression by first encoding object-level representations to align image features with distinct semantic entities. The method then employs multi-scale feature fusion and geometric constraints from head pose and gaze direction to achieve more stable and semantically consistent predictions, particularly in complex scenes. Experiments on several benchmarks, including GazeFollow and GOO-Real, show competitive performance with a compact model size. AI
IMPACT This research could lead to more intuitive human-computer interaction by improving the accuracy and stability of gaze tracking systems.
RANK_REASON This is a research paper detailing a new method for gaze estimation. [lever_c_demoted from research: ic=1 ai=1.0]
AI-generated summary · Google Gemini · from 1 sources. How we write summaries →