PulseAugur

Qwen 3.6 and Gemma 4 models get stuck in loops during AI testing

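During testing, two large language models, Qwen 3.6 and Gemma 4, were observed entering repetitive loops: they hallucinated code and were unable to correct themselves. This behavior suggests that current LLM architectures still need significant improvements in reliability and optimization before they can serve as dependable tools. The tests were run locally, and both models wasted time and received negative performance scores.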

Impact: Highlights persistent challenges in LLM reliability and self-correction, suggesting a need for architectural improvements.

Ranking rationale: This cluster discusses behaviors and limitations of AI models observed during testing, which falls under research and evaluation.

Read on Mastodon (fosstodon.org) →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    Doing some #ai tool testing today and I have watched both #Qwen3.6 and #Gemma4 get into a loop while trying to hallucinate the code needed to solve the problem I was using to compare them. I wonder how many #tokens both burnt through by not being able to recognize they were …