ENTITY MathVista

MathVista

PulseAugur coverage of MathVista — every cluster mentioning MathVista across labs, papers, and developer communities, ranked by signal.

Total · 30d: 5 (5 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 5 (5 over 90d)

TIER MIX · 90D

significant 1
research 3
tool 1

RELATIONSHIPS

instance of Multimodal Multitask Multimedia Understanding 70%

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 5 TOTAL

RESEARCH · CL_90818 · Jun 12 · 16:55

Self-Improving VLMs Can Regress on New Tasks, Study Finds

A new research paper reveals that self-improving vision-language models (VLMs) can regress on new tasks, contrary to the assumption that stronger verifiers always yield stronger students. The study found that verifier q…

TOOL · CL_79897 · Jun 9 · 04:00

Research: Stage-1 training impacts VLM entropy, not final outcome

A new research paper explores the impact of different Stage-1 training methods on vision-language models (VLMs). The study found that while Stage-1 training, such as supervised fine-tuning (SFT) or on-policy distillatio…

RESEARCH · CL_18669 · May 5 · 16:36

UnAC method enhances LMMs for complex multimodal reasoning with adaptive prompting

Researchers have introduced UnAC, a novel multimodal prompting method designed to enhance the reasoning capabilities of Large Multimodal Models (LMMs) on complex visual tasks. This method employs adaptive visual prompti…

RESEARCH · CL_04920 · Apr 24 · 12:26

New CGC framework boosts multimodal LLMs for fine-grained image understanding

Researchers have introduced Compositional Grounded Contrast (CGC), a new framework designed to enhance the fine-grained multi-image understanding capabilities of Multimodal Large Language Models (MLLMs). This approach a…

FRONTIER RELEASE · CL_02354 · Apr 16 · 10:00

OpenAI's new models let ChatGPT think with images for advanced reasoning

OpenAI has introduced its latest visual reasoning models, o3 and o4-mini, which allow AI to "think with images" as part of its internal reasoning process. These models can perform image manipulations like cropping and z…