VideoMME
PulseAugur coverage of VideoMME — every cluster mentioning VideoMME across labs, papers, and developer communities, ranked by signal.
-
OmniAgent uses active perception for efficient video understanding · 2 sources tracked
Researchers have introduced OmniAgent, an omni-modal agent for video understanding that uses an iterative Observation-Thought-Action cycle based on Partially Observable Markov Decision Processes (POMDP…
-
CREST method efficiently selects key frames from long videos
Researchers have developed CREST, a training-free method for efficiently selecting key frames from long videos. It leverages the temporal geometry of query-frame relevance, specifically focusing on loca…
-
AdaFocus framework boosts long video understanding with adaptive sampling
Researchers have developed AdaFocus, a framework for understanding long videos more efficiently. This method avoids the high costs of dense encoding or the information loss from aggressive compressi…
-
New AI methods enhance video reasoning by structuring and selecting visual evidence
Researchers are developing new methods to improve how large vision-language models (VLMs) understand and reason about long videos. Several papers introduce techniques for more efficient frame selection and evidence gath…
-
VideoThinker framework improves lightweight MLLMs' video reasoning via causal debiasing
Researchers have developed VideoThinker, a framework that enhances the video reasoning capabilities of lightweight multimodal large language models (MLLMs). This approach addresses the issue of percept…