PulseAugur
EN
LIVE 10:11:15

Ornith 35B benchmarked against Gemma4 31B and Qwen3.6 35B

A new language model, Ornith 35B, has been benchmarked against Gemma4 31B and Qwen3.6 35B using the WebBrain's frozen browser-agent planner benchmark. While Ornith 35B shows promise and slightly outperforms Qwen3.6 35B in alignment, Gemma4 31B remains the superior choice according to the test results. AI

IMPACT This benchmark provides insights into the relative performance of different language models on agent planning tasks, informing developers about model selection for AI agent applications.

RANK_REASON The cluster reports on a benchmark comparison of AI models, which falls under research. [lever_c_demoted from research: ic=1 ai=1.0]

Read on Mastodon — fosstodon.org →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

Ornith 35B benchmarked against Gemma4 31B and Qwen3.6 35B

COVERAGE [1]

  1. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    Ornith 35B claims to beat Gemma4 31B and Qwen3.6 35B on many fronts. We ran it through WebBrain's frozen browser-agent planner benchmark. Result: Ornith is soli

    Ornith 35B claims to beat Gemma4 31B and Qwen3.6 35B on many fronts. We ran it through WebBrain's frozen browser-agent planner benchmark. Result: Ornith is solid — edges Qwen 3.6 on alignment — but Gemma4 remains the better option. https:// webbrain.one/blog/ornith-35b-w ebbrain-…