Scrap the LLMs. Scoring 4.76% on the brand new ARC-3 using pure code, a 2012 AMD CPU, and zero AI tokens. [P]

Pure-code script outperforms LLMs on the ARC-AGI-3 benchmark

By the PulseAugur editorial team · [2 sources] · 2026-06-05 01:11

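A programmer has shown that a simple Python script running on a decade-old AMD CPU can score 4.76% on the new ARC-AGI-3 benchmark. The result highlights the inefficiency of current large language models, which struggle in the benchmark's dynamic, instruction-free environments and often score zero. The script relies on basic computer vision techniques such as centroid detection to solve spatial puzzles. Despite its minimal resource requirements and zero AI tokens, it outperforms many AI models.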
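The source does not include the script itself, so the following is only a minimal sketch of what centroid detection on a puzzle grid might look like. It assumes the grid is a list of rows of integer color codes with 0 as the background. That representation and the function name `color_centroids` are illustrative assumptions, not details from the post.

```python
from collections import defaultdict


def color_centroids(grid, background=0):
    """Return {color: (mean_row, mean_col)} for every non-background color.

    Hypothetical helper: the grid layout (list of rows of integer color
    codes) is an assumption made for this illustration.
    """
    # For each color, accumulate [row_sum, col_sum, cell_count].
    sums = defaultdict(lambda: [0, 0, 0])
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value != background:
                acc = sums[value]
                acc[0] += r
                acc[1] += c
                acc[2] += 1
    # The centroid is the mean position of the cells of each color.
    return {color: (rs / n, cs / n) for color, (rs, cs, n) in sums.items()}


# Toy grid: a 2x2 block of color 3 and a single cell of color 5.
grid = [
    [0, 0, 0, 0],
    [0, 3, 3, 0],
    [0, 3, 3, 0],
    [0, 0, 0, 5],
]
centroids = color_centroids(grid)
print(centroids)
```

A centroid reduces each colored object to a single coordinate, which a rule-based solver could compare across frames to infer movement or direction without any model inference.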

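Impact: Demonstrates that traditional programming can outperform current LLMs on a specific benchmark, underscoring how inefficient LLMs are at this kind of task.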

Ranking rationale: This cluster describes a novel approach to a benchmark, showing how a non-AI method performs against AI models.



AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

r/MachineLearning · TIER_1 · English (EN) · /u/-SLOW-MO-JOHN-D · 2026-06-05 01:11

Scrap the LLMs. Scoring 4.76% on the brand new ARC-3 using pure code, a 2012 AMD CPU, and zero AI tokens. [P]

https://www.reddit.com/r/MachineLearning/comments/1tx6g3i/scrap_the_llms_scoring_476_on_the_brand_new_arc3/

r/Anthropic · TIER_1 · English (EN) · /u/-SLOW-MO-JOHN-D · 2026-06-05 01:12

Scrap the LLMs. Scoring 4.76% on the brand new ARC-3 using pure code, a 2012 AMD CPU, and zero AI tokens. [P]

https://www.reddit.com/r/Anthropic/comments/1tx6hd2/scrap_the_llms_scoring_476_on_the_brand_new_arc3/

