Qwen 3.5
PulseAugur coverage of Qwen 3.5: every cluster mentioning Qwen 3.5 across labs, papers, and developer communities, ranked by signal.
-
Orthrus to release Qwen 3.5/3.6 and Gemma 4 models with open-source code
Orthrus, a project focused on training large language models, is preparing to release support for Qwen 3.5, Qwen 3.6, and Gemma 4 models. Alongside the model checkpoints, Orthrus will also open-source its complete train…
-
OpenAI internal AI usage surges; new open models and agent capabilities emerge
OpenAI has reported a significant increase in internal AI model usage across various departments, with median output tokens for Codex growing by as much as 56x in Research since November 2025. This surge in token consum…
-
DeepReinforce releases Ornith-1.0 open-source coding models that learn RL scaffolds
DeepReinforce has launched Ornith-1.0, a family of open-source coding models available under the MIT license. These models, built upon Gemma 4 and Qwen 3.5, are designed for agentic coding tasks and uniquely learn their…
-
vLLM performance boosted on AMD hardware with Qwen3.5
This article details how to optimize the vLLM inference engine for AMD hardware, specifically on a Lemonade Server. The author shares their experience fixing issues and achieving a threefold increase in batch throughput…
-
OpenMythos benchmarks released, highlights Qwen 3.6 discrepancies
Benchmark results for the OpenMythos model have been released, covering SWE-bench Pro, CyberGym, and Cybench. While the model performs well for its size and cybersecurity focus, there's potential for further…
-
Qwen 3.5 agent training methods debated on Reddit
A user on Reddit is seeking advice on training a Qwen 3.5 model for multi-tool agent use. They are asking for guidance on whether to use supervised fine-tuning (SFT) followed by reinforcement learning (RL), or an RL-onl…
-
Qwen3.5-MoE fine-tune NEX-N2-mini shows strong reasoning with low token use
A fine-tuned version of the Qwen3.5-MoE model, named NEX-N2-mini, has been released and is showing promising results. Early tests suggest it offers reasoning capabilities comparable to or better than models like Qwen3.5…
-
User explores capabilities of OpenCode and Qwen 3.5 setup
A user is exploring the capabilities of a setup that combines OpenCode and Qwen 3.5, documenting the experience to understand the strengths and limitations of this configuration.
-
AI coding tools like OpenCode and Kiro are being explored by users
Several users are exploring new AI coding tools and setups, sharing their experiences and learnings. One user documented key takeaways from a live-coding session with James Ward about Kiro, an AI tool for Java developme…
-
DeepReinforce AI releases Ornith-1.0 family of open-source coding models
DeepReinforce AI has released the Ornith-1.0 family of open-source models, designed for agentic coding tasks. The models, available in various sizes including 9B, 35B, and 397B parameters, are built upon Gemma 4 and Qwe…
-
Gemma 4 26b a4b praised for language and science tasks over Qwen
A Reddit user on the r/LocalLLaMA subreddit has found Gemma 4 26b a4b to be superior for language learning and scientific queries compared to other models like Qwen 3.5/3.6. While acknowledging Gemma 4's perceived weakn…
-
Google's Gemma 4.1 QAT series shows superior performance in benchmarks
Google's Gemma 4.1 model, specifically its Quantization-Aware Training (QAT) series, has significantly outperformed its original version, according to benchmarks by WebBrain. This finding suggests that quant…
-
New decoding strategy bypasses LLM alignment tax for better reasoning
Researchers have introduced a novel decoding strategy called Confident Decoding, which aims to mitigate the "alignment tax" in large language models. This tax occurs when final layers of LLMs, after being fine-tuned for…
-
Poolside releases Laguna M.1 open-source AI model
American AI company Poolside has released the weights for its Laguna M.1 model, making it available as an open-source model under the Apache 2.0 license. This model, previously available only via API, boasts a 256K cont…
-
Community seeks compute for GLM5.2 distillation dataset to train smaller models
A user on the r/LocalLLaMA subreddit is requesting assistance from individuals with substantial computing resources to create a large distillation dataset from GLM5.2. The goal is to generate a dataset of 700,000 to 1 m…
-
HydraHead architecture fuses attention types for improved long-context LLMs
Researchers have introduced HydraHead, a novel architecture that hybridizes Full Attention and Linear Attention at the head level within transformer models. This approach leverages interpretability to identify critical …
-
S-Agent framework enhances VLMs for 3D spatial reasoning · 4 sources tracked
Researchers have introduced S-Agent, a novel framework designed to enhance visual language models (VLMs) for spatial reasoning in 3D environments. S-Agent integrates temporal memory and a hierarchy of spatial tools to e…
-
New methods advance simultaneous speech translation quality and evaluation
Researchers have developed new methods for evaluating and improving simultaneous speech translation systems, particularly for long-form content. One paper introduces a practical evaluation framework that measures senten…
-
AI model accused of misrepresenting origins, apologizes for error
A purported new open-source AI model, Rio-3.5-Open-397B, launched by Rio de Janeiro's IT company IplanRIO, has been accused of being a repackaged version of existing models, specifically a mix of Alibaba's Qwen 3.5 and …
-
New research explores hybrid and sparse attention mechanisms for LLMs
Researchers are exploring novel methods to optimize attention mechanisms in large language models, particularly for handling long contexts. The HydraHead architecture, for instance, hybridizes Full Attention (FA) and Li…