Multi-modal Large Language Models

PulseAugur coverage of Multi-modal Large Language Models: every cluster mentioning the topic across labs, papers, and developer communities, ranked by signal.

Total · 8 over 90d
Releases · 0 over 90d
Papers · 8 over 90d
Sentiment · 30d · 4 day(s) with sentiment data

RECENT · PAGE 1/1 · 8 TOTAL

RESEARCH · CL_107688 · Jun 22 · 19:27

New 'Ground Then Rank' method boosts knowledge-based visual question answering

Researchers have developed a new framework called "Ground Then Rank" (GTR) to improve Knowledge-Based Visual Question Answering (KB-VQA) performance. This method decouples entity identification from evidence ranking, ad…

RESEARCH · CL_105257 · Jun 22 · 16:16

New benchmarks and methods tackle visual document retrieval challenges

Researchers have developed new methods to improve visual document retrieval, particularly for large collections of similar documents like invoices. One approach, Invoice Haystack, introduces a benchmark designed to stre…

RESEARCH · CL_84430 · Jun 10 · 09:30

New TASM framework boosts MLLM efficiency with structured memory

Researchers have developed a new framework called TASM (Task-Aware Structured Memory) to improve the efficiency of multi-modal large language models (MLLMs). This training-free approach addresses the limitations of curr…

RESEARCH · CL_79694 · Jun 8 · 09:21

New benchmarks and frameworks enhance video temporal grounding

Researchers have introduced new benchmarks and frameworks for improving temporal grounding in long-form videos. One study posits that hour-scale video grounding is primarily a search problem, not a recognition one, and …

RESEARCH · CL_79606 · Jun 8 · 07:19

LLM privacy research tackles Japanese data, multi-modal risks, and DP adaptation

Researchers are exploring privacy risks associated with large language models (LLMs) and their adaptations. One study focuses on detecting sensitive personal information in Japanese pre-training corpora, developing a cl…

TOOL · CL_65341 · Jun 2 · 04:00

Survey details LLM and MM-LLM use in transportation operations

A new survey paper explores the application of large language models (LLMs) and multi-modal large language models (MM-LLMs) in transportation systems management and operations. The research synthesizes current studies a…

RESEARCH · CL_36921 · May 12 · 20:49

AI agents learn human beliefs and spatial reasoning

Researchers are exploring how AI agents can better understand human beliefs and intentions, particularly in interactive scenarios. One paper proposes a second-order Theory of Mind (ToM-2) framework using I-POMDP to enab…

RESEARCH · CL_27982 · May 11 · 16:49

AI research questions video anomaly detection framing

Two new research papers challenge the current direction of video anomaly detection (VAD). The first paper argues that the field's focus on general models and multi-modal large language models (MLLMs) has shifted focus a…
