MLLMs
PulseAugur coverage of MLLMs — every cluster mentioning MLLMs across labs, papers, and developer communities, ranked by signal.
- instance of Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond 90%
- used by Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond 70%
- used by train of thought 70%
- used by Standard Chinese 70%
- used by Chain Of Thought 70%
- used by English 60%
- 2026-05-22 research_milestone A new pipeline was introduced to enhance MLLMs for safety-critical driving video analysis.
- 2026-05-22 research_milestone Researchers identify a loss of temporal grounding in multimodal large language models and propose a method to recover it.
- 2026-05-22 research_milestone A new benchmark and dataset were introduced to evaluate MLLMs' ability to reason about personality beyond superficial cues.
- 2026-05-21 research_milestone A new method using MLLMs for detecting AI-generated Chinese poetry achieves state-of-the-art results.
- New benchmark and reasoning method improve AI understanding of sports videos
Researchers have introduced SportsTime, a new benchmark dataset designed for evaluating multimodal large language models (MLLMs) on understanding long-form sports videos. The dataset includes over 14,000 question-answer…
- MLLMs struggle with egocentric pointing, new benchmark EgoPoint-Bench reveals
Researchers have developed EgoPoint-Bench, a new benchmark designed to test how well multimodal large language models (MLLMs) understand pointing gestures in egocentric vision. Current MLLMs often fail to accurately int…
- Air-Know network tackles composed image retrieval with novel expert-proxy-diversion paradigm
Researchers have introduced Air-Know, a novel network designed to tackle the Composed Image Retrieval (CIR) challenge, specifically addressing the Noisy Triplet Correspondence (NTC) problem. Existing methods struggle wi…