Brief

last 24h

[2/2] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

SIGNIFICANT · 量子位 (QbitAI) Chinese (ZH) · 4d · [15 sources]

Artificial Analysis Ranking: Qwen3.7 Tops Chinese Models, Ranks in Global Top 5

Artificial Analysis, a third-party evaluation platform, has ranked Alibaba's Qwen3.7-Max the top-performing Chinese large language model and fifth globally. The new flagship scored 56.6, ahead of other Chinese models and close to leading international models such as GPT, Claude, and Gemini. Qwen3.7-Max is designed for agentic tasks, with reported gains in programming, reasoning, and tool use, and it can handle complex, long-running tasks that involve extensive tool calls.

IMPACT Sets a new benchmark for Chinese LLMs and signals increased competition at the frontier of global model performance.

TOOL · Towards AI English (EN) · 4d

AI Does Multiplication Underneath. So Why Did Older Models Break at School Maths?

Large language models are built on mathematical operations such as multiplication, yet they have historically struggled with basic arithmetic, such as comparing decimal numbers. The reason is that models use multiplication not to calculate directly but to transform and relate information between tokens via learned weights. Modern models are improving, but their inability to recognize their own errors highlights a fundamental difference between their internal processes and human understanding of mathematics.

IMPACT Highlights a gap in LLM reasoning: models may not reliably perform basic arithmetic even though they are built on mathematical operations.
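
The distinction in this summary can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not taken from the article: the function names, the example values 9.11 and 9.9, and the "piecewise" comparison, which is only one hypothetical way a text-based system could get a decimal comparison wrong. The `mix` function shows multiplication used as a weighted blend of token vectors, which is a different job from calculating with the numbers the tokens spell out.

```python
from decimal import Decimal


def compare_as_numbers(a: str, b: str) -> str:
    """Return whichever decimal string has the larger numeric value."""
    return a if Decimal(a) > Decimal(b) else b


def compare_piecewise(a: str, b: str) -> str:
    """Hypothetical failure mode: treat the parts around the decimal
    point as separate whole numbers, the way version numbers compare."""
    a_parts = [int(p) for p in a.split(".")]
    b_parts = [int(p) for p in b.split(".")]
    return a if a_parts > b_parts else b


def mix(weights, vectors):
    """Weighted sum of token vectors. The multiplication here blends
    information between tokens; it does not do arithmetic on the
    numbers that the tokens represent."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


print(compare_as_numbers("9.11", "9.9"))
print(compare_piecewise("9.11", "9.9"))
print(mix([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]]))
```

The two comparison functions disagree on the same input, which is the kind of gap the article describes: the text looks numeric, but nothing in the weighted mixing step forces it to be treated as a number.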
