PulseAugur
MLLMs

PulseAugur coverage of multimodal large language models (MLLMs): every cluster mentioning MLLMs across labs, papers, and developer communities, ranked by signal.

Total · 30d: 96 (96 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 96 (96 over 90d)
TIMELINE
  1. 2026-05-22 research_milestone A new pipeline was introduced to enhance MLLMs for safety-critical driving video analysis.
  2. 2026-05-22 research_milestone Researchers propose a method to recover temporal grounding in multimodal large language models.
  3. 2026-05-22 research_milestone A new benchmark and dataset were introduced to evaluate MLLMs' ability to reason about personality beyond superficial cues.
  4. 2026-05-21 research_milestone A new method using MLLMs for detecting AI-generated Chinese poetry achieves state-of-the-art results.
SENTIMENT · 30D

18 days with sentiment data

RECENT · PAGE 3/5 · 96 TOTAL
  1. TOOL · CL_38243

    New CrossView Suite enhances multimodal models' spatial reasoning

    Researchers have introduced the CrossView Suite, a comprehensive framework designed to enhance the spatial reasoning capabilities of multimodal large language models (MLLMs). This suite addresses limitations in cross-vi…

  2. RESEARCH · CL_37979

    New image tokenization methods boost MLLM performance

    Two new research papers propose novel methods for tokenizing images to improve multimodal large language models (MLLMs). The first paper, VFMTok, uses a frozen vision foundation model as a tokenizer, achieving significa…

  3. RESEARCH · CL_43941

    New architectures enable real-time video understanding

    Researchers are developing new methods for real-time video understanding, moving beyond traditional offline analysis. Several papers propose architectures that decouple visual perception from language generation to impr…

  4. TOOL · CL_36926

    New benchmark reveals MLLMs struggle with spatial reasoning

    Researchers have developed PCSR-Bench, a new benchmark designed to evaluate the spatial reasoning capabilities of Multimodal Large Language Models (MLLMs) when processing omnidirectional images. The benchmark, comprisin…

  5. TOOL · CL_27571

    New benchmark EgoMemReason tests AI memory in week-long videos

    Researchers have introduced EgoMemReason, a new benchmark designed to test the memory capabilities of multimodal large language models (MLLMs) and agentic frameworks in understanding long-horizon egocentric videos. The …

  6. TOOL · CL_22498

    New metric evaluates MLLMs for logical consistency without annotations

    Researchers have introduced a new metric, VL-LCM, to evaluate the logical consistency of multimodal large language models (MLLMs) without requiring ground-truth annotations. This metric assesses the cause-effect reasoni…

  7. RESEARCH · CL_22492

    AI research highlights challenges in cross-cultural and non-English language model development

    Two new research papers highlight challenges in developing AI for non-English languages and cultures. One paper reflects on two decades of building Arabic NLP resources, concluding that social and institutional factors …

  8. TOOL · CL_22465

    New research reveals MLLM jailbreaks exploit reconstruction-concealment tradeoff

    Researchers have identified a critical tradeoff in multimodal large language models (MLLMs) related to how harmful queries are concealed and reconstructed. They found that existing methods for transforming harmful input…

  9. TOOL · CL_22437

    Visual Para-Thinker introduces parallel reasoning to multimodal LLMs

    Researchers have introduced Visual Para-Thinker, a novel framework for parallel reasoning in multimodal large language models (MLLMs). This approach shifts from vertical scaling of reasoning depth to a parallel strategy…

  10. TOOL · CL_22420

    New SOW method uses MLLMs to improve image generation coherence

    Researchers have introduced Selective One-Way Diffusion (SOW), a novel approach to image generation that reframes diffusion models for improved contextual coherence. SOW utilizes Multimodal Large Language Models (MLLMs)…

  11. TOOL · CL_22405

    MLLMs enable training-free dense hand contact estimation that outperforms supervised methods

    Researchers have developed ContactPrompt, a novel training-free method for dense hand contact estimation that utilizes multi-modal large language models (MLLMs). This approach addresses challenges in encoding 3D hand ge…

  12. RESEARCH · CL_21787

    New MedHorizon benchmark tests AI's ability to understand long medical videos

    Researchers have introduced MedHorizon, a new benchmark designed to test multimodal large language models (MLLMs) on understanding long-form medical videos. This benchmark includes 759 hours of clinical procedures and 1…

  13. TOOL · CL_20778

    Vision-EKIPL framework boosts MLLM visual reasoning with external knowledge infusion

    Researchers have introduced Vision-EKIPL, a novel reinforcement learning framework designed to enhance visual reasoning in Multimodal Large Language Models (MLLMs). This approach incorporates high-quality actions genera…

  14. TOOL · CL_18628

    New MSEarth benchmark uses MLLMs for Earth science discovery

    Researchers have developed MSEarth, a new multimodal benchmark designed to evaluate the capabilities of multimodal large language models (MLLMs) in Earth science reasoning. This dataset comprises over 289,000 figures wi…

  15. RESEARCH · CL_18678

    New VQA methods enhance explainability and knowledge integration for multimodal LLMs

    Researchers have developed CoExVQA, a new framework for Document Visual Question Answering (DocVQA) that enhances explainability by breaking down the reasoning process. This method first identifies relevant evidence, th…

  16. RESEARCH · CL_18700

    MLLMs show promise in analyzing seizure movements, outperforming traditional models

    A pilot study explored the use of multimodal large language models (MLLMs) for analyzing pathological movements in seizure videos. The research found that MLLMs, without specific training, outperformed traditional compu…

  17. RESEARCH · CL_21948

    New AI unlearning methods balance data removal with model utility

    Researchers have developed new methods for machine unlearning, a process that removes specific data from AI models without full retraining. One approach, SHRED, uses self-distillation and logit demotion to identify and …

  18. TOOL · CL_15945

    New In-Prompt Process Supervision framework enhances MLLMs for video moderation

    Researchers have developed a new framework called IPS (In-Prompt Process Supervision) to enhance the accuracy of multimodal large language models (MLLMs) in content moderation for short videos. This method incorporates …

  19. TOOL · CL_15707

    Researchers use RL to improve MLLM regression on imbalanced data

    Researchers have developed a new framework to improve how multimodal large language models (MLLMs) handle numerical regression tasks, particularly those with imbalanced data distributions. Existing training methods ofte…

  20. RESEARCH · CL_15670

    New HERMES and DSCache methods improve streaming video understanding with KV cache

    Researchers have developed new methods to improve the efficiency of multimodal large language models (MLLMs) for understanding streaming video. One approach, HERMES, conceptualizes the KV cache as a hierarchical memory …