MathVista
PulseAugur coverage of MathVista — every cluster mentioning MathVista across labs, papers, and developer communities, ranked by signal.
-
UnAC method enhances LMMs for complex multimodal reasoning with adaptive prompting
Researchers have introduced UnAC, a novel multimodal prompting method designed to enhance the reasoning capabilities of Large Multimodal Models (LMMs) on complex visual tasks. This method employs adaptive visual prompti…
-
New CGC framework boosts multimodal LLMs for fine-grained image understanding
Researchers have introduced Compositional Grounded Contrast (CGC), a new framework designed to enhance the fine-grained multi-image understanding capabilities of Multimodal Large Language Models (MLLMs). This approach a…
-
OpenAI's new models let ChatGPT think with images for advanced reasoning
OpenAI has introduced its latest visual reasoning models, o3 and o4-mini, which allow AI to "think with images" as part of its internal reasoning process. These models can perform image manipulations like cropping and z…