PulseAugur
research

Qwen 3.6 Plus outperforms DeepSeek V4 Pro in price and quality benchmarks

A recent battle test of six LLMs released in April found that Qwen 3.6 Plus, released 22 days prior, outperformed the newer DeepSeek V4 Pro. Despite DeepSeek V4 Pro's advanced reasoning architecture and top scores on AIME and SWE-bench, it scored 89 out of 100 on the test, which was run on Russian-language content, while Qwen 3.6 Plus scored 92. The test also highlighted a significant cost disparity: DeepSeek's Flash variant is 13 times cheaper than the Pro version, though it also scored lower.

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Qwen 3.6 Plus's higher score and better cost-effectiveness than newer models such as DeepSeek V4 Pro suggest that the newest or most expensive LLM is not automatically the best choice for production use.

RANK_REASON The cluster reports on comparative benchmark results for multiple LLMs, which falls under research.


COVERAGE [2]

  1. Mastodon (fosstodon.org) · TIER_1 · Russian (RU)

    Ran six April LLMs through a battle test. The winner was neither the newest nor the most expensive. DeepSeek V4 Pro came out on April 24. A huge model, top of AIME and SWE-bench, with a cutting-edge reasoning architecture. I expected Tier S, meaning 95+ out of 100 in our battle test on Russian-language content. I got 89. Launched …

  2. Mastodon (fosstodon.org) · TIER_1 · Russian (RU)

    Ran six April LLMs through a battle test. The winner was neither the newest nor the most expensive. DeepSeek V4 Pro came out on April 24. Huge... #LLM #DeepSeek #Qwen #Kimi #Benchmarks #AI #OpenRouter #Russian #NLP