
Qwen 3.6 Plus beats DeepSeek V4 Pro on price and quality in a benchmark test

By the PulseAugur editorial team · [2 sources] · 2026-04-28 10:50

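A recent hands-on test of six large language models (LLMs) released in April found that Qwen 3.6 Plus (released 22 days ago) outperformed the newer DeepSeek V4 Pro. Although DeepSeek V4 Pro has an advanced reasoning architecture and top scores on AIME and SWE-bench, it scored only 89 in the test, against 92 for Qwen 3.6 Plus. The test also highlighted a large cost gap: DeepSeek's Flash variant is 13 times cheaper than its Pro variant, but it also scored lower.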

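Impact: Qwen 3.6 Plus comes out ahead of newer models such as DeepSeek V4 Pro on both performance and cost-effectiveness, which suggests the best choice for production LLM selection may be shifting.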

Ranking rationale: This cluster reports comparative benchmark results for multiple LLMs and falls under research.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

Mastodon (fosstodon.org) · TIER_1 · Russian (RU) · [email protected] · 2026-04-28 10:52

Ran six April LLMs through a battle test. The winner was neither the newest nor the most expensive. DeepSeek V4 Pro came out on April 24: a huge model, top AIME and SWE-bench scores, a cutting-edge reasoning architecture. I expected Tier S, 95+ out of 100 in our battle test on Russian-language content. It got 89. I ran …

Link: habr.com/…/1029044
Mastodon (fosstodon.org) · TIER_1 · Russian (RU) · [email protected] · 2026-04-28 10:50

Ran six April LLMs through a battle test. The winner was neither the newest nor the most expensive. DeepSeek V4 Pro came out on April 24. Huge... #LLM #DeepSeek #Qwen #Kimi #Benchmarks #AI #OpenRouter #Russian #NLP

Links: habr.com/…/1029044 awakari.com/sub-details.html awakari.com/pub-msg.html
