LMMs
PulseAugur coverage of LMMs — every cluster mentioning LMMs across labs, papers, and developer communities, ranked by signal.
-
UnAC method enhances LMMs for complex multimodal reasoning with adaptive prompting
Researchers have introduced UnAC, a novel multimodal prompting method designed to enhance the reasoning capabilities of Large Multimodal Models (LMMs) on complex visual tasks. This method employs adaptive visual prompti…
-
New CSteer method guides large multimodal models to refer to multiple regions without fine-tuning
Researchers have developed a new training-free method called Contextual Latent Steering (CSteer) to enhance the ability of Large Multimodal Models (LMMs) to accurately identify and refer to multiple specific regions wit…
-
Researchers develop Glance-or-Gaze to improve LMM visual search with adaptive focus
Researchers have introduced Glance-or-Gaze (GoG), a new framework designed to improve Large Multimodal Models (LMMs) in handling knowledge-intensive visual queries. Unlike previous methods that retrieve information indi…
-
New benchmark UNIKIE-BENCH evaluates large multimodal models for document information extraction
Researchers have introduced UNIKIE-BENCH, a new benchmark designed to systematically evaluate the performance of Large Multimodal Models (LMMs) in extracting key information from visual documents. The benchmark features…