ENTITY Qwen3-VL

PulseAugur coverage of Qwen3-VL: every cluster mentioning Qwen3-VL across labs, papers, and developer communities, ranked by signal.

Total · 30d: 26 (26 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 23 (23 over 90d)

TIER MIX · 90D

significant 1
research 11
tool 14

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 8 TOTAL

TOOL · CL_27995 · May 11 · 15:34

Alibaba's Qwen-Image-2.0 model unifies image generation and editing

Alibaba's Qwen-Image-2.0 is a new foundation model designed for both high-fidelity image generation and precise editing within a single framework. It addresses limitations in existing models concerning ultra-long text r…

TOOL · CL_25786 · May 8 · 10:35

New framework enables remote sensing models to adapt to scale variations

Researchers have developed ScaleEarth, a novel framework for remote sensing vision-language models (RS-VLMs) that addresses the challenge of varying ground sampling distances (GSDs). Unlike previous methods that treat G…

RESEARCH · CL_14044 · May 1 · 17:54

Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

Researchers have introduced Persistent Visual Memory (PVM), a novel module designed to address the "Visual Signal Dilution" problem in Large Vision-Language Models (LVLMs). This issue causes visual attention to weaken a…

RESEARCH · CL_11696 · May 1 · 04:00

WaferSAGE uses LLMs to analyze semiconductor defects with synthetic data

Researchers have developed WaferSAGE, a framework utilizing a 4B-parameter Qwen3-VL model for visual question answering on wafer defects in semiconductor manufacturing. The system addresses data scarcity by employing a …

RESEARCH · CL_06598 · Apr 28 · 04:00

Researchers develop precise video language models with human-AI oversight

Researchers have developed a new framework called CHAI (Critique-based Human-AI Oversight) to improve video captioning and generation. This method uses AI to generate initial captions, which are then refined by human ex…

RESEARCH · CL_08227 · Apr 28 · 01:21

Researchers probe VLM safety with embedding-guided typographic attacks

Researchers have developed a method to probe the safety vulnerabilities of vision-language models (VLMs) by using typographic prompt injections. Their study found that multimodal embedding distance strongly predicts att…

FRONTIER RELEASE · CL_01761 · Feb 16 · 05:44

Alibaba's Qwen3.5-397B-A17B model offers multimodal capabilities and efficient inference

Alibaba has released Qwen3.5-397B-A17B, an open-weight, natively multimodal model featuring a hybrid attention mechanism and sparse Mixture-of-Experts architecture. The model boasts support for 201 languages and demonst…

SIGNIFICANT · CL_01804 · Sep 23 · 05:44

Alibaba Cloud launches 7 new AI models and a $52B roadmap

Alibaba Cloud announced a significant expansion of its AI capabilities, releasing seven new models over a four-day period. Among these were the Qwen3-Max, Qwen3-Omni, and Qwen3-VL models, indicating advancements in vari…