A recent benchmark comparing the open-weights GLM 5.2 model against Gemini 3 Flash reported that GLM 5.2 performs approximately 15% worse in text adventure games. GLM 5.2 achieved about 15 achievements per attempt, while Gemini 3 Flash averaged over eight. GLM 5.2 is currently priced higher than Gemini 3 Flash on OpenRouter, though its price is expected to decrease with more efficient deployment. Other models, such as Sonnet 4.5 and GPT 5.2, were found to be significantly less capable due to budget constraints.
IMPACT GLM 5.2's performance in text adventures suggests it may lag behind top-tier commercial models in certain complex reasoning tasks.
RANK_REASON The cluster details a benchmark comparing the performance of an open-weights model (GLM 5.2) against commercial models in a specific task (text adventures).
AI-generated summary · Google Gemini · from 1 source.