Why I stopped using semantic embeddings for tool selection and switched back to BM25 [D]

Developer drops semantic embeddings in favor of BM25 for AI agent tool selection

By PulseAugur Editorial · [1 source] · 2026-06-08 13:24

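While building AI agents, a developer found that the semantic embeddings commonly used for tool selection were unreliable in production. The embeddings struggled to tell apart tools with similar descriptions, so the agent picked the wrong tool. After testing three retrieval strategies, BM25-based search proved the most effective: by indexing tool names, descriptions, and schema fields, it reached 81% top-1 accuracy.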
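To make the BM25 approach concrete, here is a minimal sketch of ranking tools by keyword overlap across name, description, and schema field names. It is not the developer's implementation, which the summary does not show. The tool definitions, the `BM25Index` class, the tokenizer, and the parameters `k1=1.5` and `b=0.75` (common defaults) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical tool catalog; a real agent would load these from its MCP servers.
TOOLS = [
    {"name": "create_invoice",
     "description": "Create a new invoice for a customer",
     "schema": {"customer_id": "string", "amount": "number"}},
    {"name": "send_invoice",
     "description": "Send an existing invoice to a customer by email",
     "schema": {"invoice_id": "string", "email": "string"}},
    {"name": "refund_payment",
     "description": "Refund a payment to a customer",
     "schema": {"payment_id": "string", "amount": "number"}},
]


def tokenize(text):
    # Lowercase and split on non-alphanumerics, so "create_invoice" -> ["create", "invoice"].
    return re.findall(r"[a-z0-9]+", text.lower())


def tool_document(tool):
    # Flatten name, description, and schema field names into one token list.
    fields = " ".join(tool.get("schema", {}).keys())
    return tokenize(f"{tool['name']} {tool['description']} {fields}")


class BM25Index:
    """Okapi BM25 over a small in-memory tool catalog (illustrative only)."""

    def __init__(self, tools, k1=1.5, b=0.75):
        self.tools = tools
        self.k1, self.b = k1, b
        self.docs = [Counter(tool_document(t)) for t in tools]
        self.lengths = [sum(d.values()) for d in self.docs]
        self.avg_len = sum(self.lengths) / len(self.docs)
        # Document frequency: in how many tool documents each term appears.
        df = Counter(term for d in self.docs for term in d)
        n = len(self.docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        doc, length = self.docs[i], self.lengths[i]
        total = 0.0
        for term in tokenize(query):
            tf = doc.get(term, 0)
            if tf == 0:
                continue  # terms absent from this tool contribute nothing
            norm = tf + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf[term] * tf * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k=1):
        ranked = sorted(range(len(self.tools)),
                        key=lambda i: self.score(query, i), reverse=True)
        return [self.tools[i]["name"] for i in ranked[:k]]


index = BM25Index(TOOLS)
print(index.top_k("refund the payment"))
print(index.top_k("email invoice"))
```

Because schema field names such as `email` and `payment_id` are indexed alongside the description, a query that mentions a parameter can separate two tools whose descriptions read almost the same, which is the failure mode the summary attributes to embeddings.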

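Impact: Highlights the limitations of standard semantic search for structured tool selection in AI agents, and argues for keyword-matching methods such as BM25.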

Ranking rationale: A developer shared findings from testing retrieval strategies for AI agent tool selection, comparing semantic embeddings with BM25.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/MachineLearning · /u/AbjectBug5885 · 2026-06-08 13:24

Why I stopped using semantic embeddings for tool selection and switched back to BM25 [D]

I've been building agents for about a year and recently shipped one for a client running ~140 MCP-exposed tools at peak. Along the way I made the canonical mistake. I used cosine similarity over tool description embeddings to pick which tools the…