MathVista
PulseAugur coverage of MathVista — every cluster mentioning MathVista across labs, papers, and developer communities, ranked by signal.
2 days with sentiment data
-
Self-Improving VLMs Can Regress on New Tasks, Study Finds
A new research paper reveals that self-improving vision-language models (VLMs) can regress on new tasks, contrary to the assumption that stronger verifiers always yield stronger students. The study found that verifier q…
-
Research: Stage-1 training impacts VLM entropy, not final outcome
A new research paper explores the impact of different Stage-1 training methods on vision-language models (VLMs). The study found that while Stage-1 training, such as supervised fine-tuning (SFT) or on-policy distillatio…
-
UnAC method enhances LMMs for complex multimodal reasoning with adaptive prompting
Researchers have introduced UnAC, a novel multimodal prompting method designed to enhance the reasoning capabilities of Large Multimodal Models (LMMs) on complex visual tasks. This method employs adaptive visual prompti…
-
New CGC framework boosts multimodal LLMs for fine-grained image understanding
Researchers have introduced Compositional Grounded Contrast (CGC), a new framework designed to enhance the fine-grained multi-image understanding capabilities of Multimodal Large Language Models (MLLMs). This approach a…
-
OpenAI's new models let ChatGPT think with images for advanced reasoning
OpenAI has introduced its latest visual reasoning models, o3 and o4-mini, which allow AI to "think with images" as part of its internal reasoning process. These models can perform image manipulations like cropping and z…