The AI startup Poetiq has developed a self-optimizing harness that sets a new state of the art on coding and ARC-AGI benchmarks. Running on Google's Gemini 3 Flash model, the harness surpassed Anthropic's Claude Opus 4.7 in these evaluations. This recursive self-improvement technique represents a significant advance in AI reasoning efficiency.
Impact: Sets a new state of the art on coding and ARC-AGI benchmarks, showcasing advances in AI reasoning efficiency.
Ranking rationale: The cluster reports a new benchmark achievement for an AI system, which is a research milestone.
Read on Mastodon: mastodon.social
AI-generated summary · Google Gemini · from 4 sources.