A user is attempting to benchmark the DeepSeek 4 Pro model, but its servers are under heavy load. The benchmark is a complex reverse-engineering task: creating a tool that builds Apollo GraphQL hashes. So far, no open-weight model has completed the benchmark, while proprietary models such as Anthropic's Opus 4.7 and OpenAI's GPT 5.5 have succeeded.
Impact: Provides comparative performance data for proprietary models on a complex reverse-engineering task.
Ranking rationale: The user is running a benchmark on a model and comparing results, which falls under research.
Read on Mastodon (fosstodon.org) →
- Anthropic Opus 4.6
- Anthropic Opus 4.7
- Apollo GraphQL
- DeepSeek 4 Pro
- Gemini Pro 3.1
- GitHub Copilot
- Kimi K2.6
- Ollama
- OpenAI GPT-5.4
- OpenAI GPT 5.5
- OpenCode Go
- OpenRouter
AI-generated summary · Google Gemini · from 1 source.