ENTITY Large Multimodal Models as Social Multimedia Analysis Engines

PulseAugur coverage of Large Multimodal Models as Social Multimedia Analysis Engines: every cluster mentioning this entity across labs, papers, and developer communities, ranked by signal.

Total · 30d

10 over 90d

Releases · 30d

0 over 90d

Papers · 30d

10 over 90d

TOPICS

paper 10
other 5
model release 3
safety 1
product 1

RELATIONSHIPS

instance of Large Multimodal Models 95%
instance of LMMs 95%

SENTIMENT · 30D

2 day(s) with sentiment data

RECENT · PAGE 1/1 · 10 TOTAL

TOOL · CL_30558 · May 13 · 08:49

New FIKA-Bench tests AI knowledge acquisition beyond visual recognition

Researchers have introduced FIKA-Bench, a new benchmark designed to evaluate the ability of AI systems to acquire knowledge about unfamiliar objects, moving beyond simple visual recognition. The benchmark consists of 31…

RESEARCH · CL_27969 · May 11 · 17:59

New benchmarks reveal major gaps in multimodal context learning for LLMs

Two new benchmarks, MMCL-Bench and Personal-VCL-Bench, have been introduced to evaluate the multimodal context learning capabilities of large language models. MMCL-Bench focuses on learning from visual rules, procedures…

TOOL · CL_28006 · May 11 · 13:59

New method enhances LMM spatial reasoning with generated viewpoints

Researchers have introduced a new paradigm called Thinking with Novel Views (TwNV) to enhance the spatial reasoning capabilities of Large Multimodal Models (LMMs). This approach integrates generative novel-view synthesi…

TOOL · CL_25781 · May 8 · 12:07

New LithoBench benchmark reveals large multimodal model limitations

Researchers have introduced LithoBench, a new benchmark designed to evaluate the capabilities of large multimodal models in interpreting geological lithology from remote sensing data. This benchmark includes 10,000 expe…

RESEARCH · CL_18242 · May 5 · 15:56

New CC-OCR V2 benchmark reveals LMMs fall short in real-world document processing

A new benchmark, CC-OCR V2, has been released to evaluate Large Multimodal Models (LMMs) on real-world document processing tasks. The benchmark includes 7,093 challenging samples across five OCR-centric tracks, addressi…

TOOL · CL_15665 · May 5 · 04:00

New CSteer method guides large multimodal models to refer multiple regions without fine-tuning

Researchers have developed a new training-free method called Contextual Latent Steering (CSteer) to enhance the ability of Large Multimodal Models (LMMs) to accurately identify and refer to multiple specific regions wit…

RESEARCH · CL_18703 · May 5 · 02:05

VEBench benchmark evaluates large multimodal models for video editing tasks

Researchers have introduced VEBENCH, a new benchmark designed to evaluate Large Multimodal Models (LMMs) in real-world video editing tasks. The benchmark includes over 3.9K edited videos and 3,080 question-answer pairs,…

RESEARCH · CL_10260 · Apr 30 · 04:00

Tree-of-Evidence algorithm enhances multimodal AI interpretability

Researchers have developed a new method called Tree-of-Evidence (ToE) to improve the interpretability of Large Multimodal Models (LMMs). ToE frames model interpretability as an optimization problem, using lightweight "E…

RESEARCH · CL_10152 · Apr 30 · 04:00

Researchers develop Glance-or-Gaze to improve LMM visual search with adaptive focus

Researchers have introduced Glance-or-Gaze (GoG), a new framework designed to improve Large Multimodal Models (LMMs) in handling knowledge-intensive visual queries. Unlike previous methods that retrieve information indi…

RESEARCH · CL_05112 · Apr 27 · 04:00

New benchmark UNIKIE-BENCH evaluates large multimodal models for document information extraction

Researchers have introduced UNIKIE-BENCH, a new benchmark designed to systematically evaluate the performance of Large Multimodal Models (LMMs) in extracting key information from visual documents. The benchmark features…
