Large Vision-Language Models
PulseAugur coverage of Large Vision-Language Models: every cluster mentioning Large Vision-Language Models across labs, papers, and developer communities, ranked by signal.
- New framework estimates LVLM confidence by contrasting image-based predictions
  Researchers have developed a new framework called BICR (Blind-Image Contrastive Ranking) to assess the confidence of Large Vision-Language Models (LVLMs). This method helps distinguish between predictions genuinely info…
- Composer framework advances aesthetic image generation via composition transfer
  Researchers have developed Composer, a new framework designed to improve the aesthetic quality of generated images by explicitly modeling composition. This approach separates composition from semantics, allowing for com…
- New VIDA dataset tackles ambiguity in multimodal machine translation
  Researchers have introduced VIDA, a new dataset designed to tackle ambiguity in multimodal machine translation. The dataset contains 2,500 instances where visual context is crucial for resolving ambiguous expressions. E…