PulseAugur

Qwen3-VL

PulseAugur coverage of Qwen3-VL: every cluster mentioning Qwen3-VL across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 9 (90 days: 9)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 7 (90 days: 7)
Tier distribution · 90 days
Sentiment · 30 days: 3 days with sentiment data

Recent · Page 1 of 1 · 9 items
  1. TOOL · CL_36060 ·

    VLMs fail to re-examine images when prompted, study finds

    Researchers have developed a new framework called VisualSwap to test whether Vision-Language Models (VLMs) truly re-examine images when they claim to. Their experiments using the VS-Bench dataset on models like Qwen3-VL…

  2. RESEARCH · CL_27995 ·

    Alibaba's Qwen unveils advanced image generation and VAE models

    Alibaba's Qwen team has released technical reports for two new image models: Qwen-Image-VAE-2.0 and Qwen-Image-2.0. Qwen-Image-VAE-2.0 is a high-compression Variational Autoencoder designed for improved reconstruction f…

  3. TOOL · CL_25786 ·

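    New framework lets remote sensing models adapt to scale variation

    Researchers have developed ScaleEarth, a novel framework for remote sensing vision-language models (RS-VLMs) that addresses the challenges posed by variation in ground sampling distance (GSD). Unlike prior approaches that treat GSD as a discrete token, ScaleEarth uses a continuous conditioning variable to dynamically adjust the model's computation path according to physical scale. The method performs GSD prediction via CS-HLoRA and SSE-U and achieves state-of-the-art results on remote sensing benchmarks.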

  4. RESEARCH · CL_14044 ·

    Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

    Researchers have introduced Persistent Visual Memory (PVM), a novel module designed to address the "Visual Signal Dilution" problem in Large Vision-Language Models (LVLMs). This issue causes visual attention to weaken a…

  5. RESEARCH · CL_11696 ·

    WaferSAGE uses LLMs to analyze semiconductor defects with synthetic data

    Researchers have developed WaferSAGE, a framework utilizing a 4B-parameter Qwen3-VL model for visual question answering on wafer defects in semiconductor manufacturing. The system addresses data scarcity by employing a …

  6. RESEARCH · CL_06598 ·

    Researchers develop precise video language models with human-AI oversight

    Researchers have developed a new framework called CHAI (Critique-based Human-AI Oversight) to improve video captioning and generation. This method uses AI to generate initial captions, which are then refined by human ex…

  7. RESEARCH · CL_08227 ·

    Researchers probe VLM safety with embedding-guided typographic attacks

    Researchers have developed a method to probe the safety vulnerabilities of vision-language models (VLMs) by using typographic prompt injections. Their study found that multimodal embedding distance strongly predicts att…

  8. FRONTIER RELEASE · CL_01761 ·

    Alibaba's Qwen3.5-397B-A17B model offers multimodal capabilities and efficient inference

    Alibaba has released Qwen3.5-397B-A17B, an open-weight, natively multimodal model featuring a hybrid attention mechanism and sparse Mixture-of-Experts architecture. The model boasts support for 201 languages and demonst…

  9. SIGNIFICANT · CL_01804 ·

    Alibaba Cloud launches 7 new AI models and a $52B roadmap

    Alibaba Cloud announced a significant expansion of its AI capabilities, releasing seven new models over a four-day period. Among these were the Qwen3-Max, Qwen3-Omni, and Qwen3-VL models, indicating advancements in vari…