
Unsloth Studio reaches 111.3 tokens/sec with the Gemma4:e4B model

By the PulseAugur editorial team · [1 source] · 2026-06-23 01:59

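Running on a laptop, Unsloth Studio reported a message timing of 111.3 tokens per second with the Gemma4:e4B model (technically gemma-4-E4B-it-qat-GGUF · UD-Q4_K_XL). The user, who had run Ollama on the same laptop, called the throughput "crazy". They noted one drawback, that the web app does not automatically speak its responses, but were optimistic about possibly writing a fix themselves.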

Impact: Provides a concrete performance figure for running an AI model locally.

Ranking rationale: A user reported a performance figure for a specific combination of model and software.



AI-generated summary · Google Gemini · from 1 source.


Sources [1]

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-06-23 01:59


So, Unsloth Studeo, same laptop as Ollama, Gemma4:e4B: (technically gemma-4-E4B-it-qat-GGUF · UD-Q4_K_XL) Message timing 111.3 tok/s. Tokens per second. This is just plain crazy. I mean, the web app doesn't automatically speak responses, which sucks, but like I may be able to vib…

