Video-MLLMs
PulseAugur coverage of Video-MLLMs — every cluster mentioning Video-MLLMs across labs, papers, and developer communities, ranked by signal.
-
MotionAtlas system offers detailed region captioning for videos
Researchers have introduced MotionAtlas, a novel system designed for detailed captioning of motion-centric videos. This system includes a new benchmark dataset with 2,073 multiple-choice questions, a scalable pipeline f…
-
New SER method enhances Video MLLM reasoning with semantic evidence rewards · 4 sources tracked
Researchers have developed a new method called Semantic Evidence Reward (SER) to improve the spatio-temporal reasoning capabilities of Video Multimodal Large Language Models (Video MLLMs). Existing models often struggle…
-
New CARE framework optimizes reasoning length in Video-MLLMs
Researchers have introduced CARE, a novel framework designed to optimize reasoning length in multimodal video models. This competence-aware reward shaping approach adapts the model's training by shifting its preference …
-
New CF-GRPO framework enhances video reasoning in multimodal LLMs
Researchers have introduced Consensus Frame GRPO (CF-GRPO), a novel reward framework designed to enhance the reasoning capabilities of video multimodal large language models (Video-MLLMs). This framework operates withou…
-
FCMBench-Video benchmark evaluates document understanding in videos for AI models
Researchers have introduced FCMBench-Video, a new benchmark designed to evaluate the capabilities of Video-Multimodal Large Language Models (Video-MLLMs) in understanding documents presented in video format. This benchm…