PulseAugur
English(EN) Closing the Calibration Gap in Semantic Caching

New Metrics Reveal a Performance Gap in Semantic Caching

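Researchers found a significant gap between how semantic caching systems perform in offline evaluation and how they perform in real deployments. Standard metrics such as PR-AUC measure only how well scores rank and do not account for whether those scores are usable at a fixed threshold, which leads to poor selection choices. The paper proposes two new metrics, precise cache hit rate (P-CHR) AUC and calibration retention rate (CRR), to better measure cache performance and the quality degradation that occurs during deployment. The findings indicate that improving semantic caching is primarily a calibration problem rather than merely a data-scaling problem.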
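The distinction between ranking quality and usability at a fixed threshold can be illustrated with a small sketch. It does not implement P-CHR AUC or CRR, whose definitions are not given here. It uses average precision as a stand-in for PR-AUC, and the scores, labels, threshold, and the 0.8 shrink factor are all made-up assumptions.

```python
def average_precision(scores, labels):
    """Ranking-only metric (a common estimate of PR-AUC); depends on order alone."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at each correct match
    return total / sum(labels)


def at_threshold(scores, labels, threshold):
    """Return (hit_rate, precision) when every score >= threshold is served from cache."""
    served = [lab for s, lab in zip(scores, labels) if s >= threshold]
    hit_rate = len(served) / len(scores)
    precision = sum(served) / len(served) if served else 0.0
    return hit_rate, precision


# Hypothetical data: 1 means the cached response was actually correct for the query.
labels = [1, 1, 0, 1, 0, 0]
offline = [0.95, 0.90, 0.85, 0.80, 0.60, 0.40]
# Assumed deployment drift: every score shrinks, but the ordering is unchanged.
deployed = [s * 0.8 for s in offline]

THRESHOLD = 0.82  # hypothetical threshold tuned on the offline scores

print(average_precision(offline, labels), average_precision(deployed, labels))
print(at_threshold(offline, labels, THRESHOLD), at_threshold(deployed, labels, THRESHOLD))
```

In this toy setup the ranking metric is identical for both score sets, while the fixed threshold that served half of the offline queries serves none of the deployed ones. That is the kind of gap a ranking-only metric cannot see.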

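Impact: Highlights the need for better evaluation metrics in LLM inference optimization, which could lead to more cost-effective deployments.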

Ranking rationale: This cluster contains a research paper published on arXiv that details new metrics for evaluating semantic caching systems.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

  1. arXiv cs.CL TIER_1 English(EN) · Aditeya Baral, Radoslav Ralev, Iliya Sotirov Zhechev, Srijith Rajamohan, Jen Agarwal ·

    Closing the Calibration Gap in Semantic Caching

    arXiv:2606.19719v1 Announce Type: cross Abstract: Semantic caching cuts LLM inference costs by serving a cached response to semantically similar queries. Standard practice evaluates these systems using PR-AUC, a metric that only measures how well scores rank and ignores whether t…

  2. arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Jen Agarwal ·

    Closing the Calibration Gap in Semantic Caching

    Semantic caching cuts LLM inference costs by serving a cached response to semantically similar queries. Standard practice evaluates these systems using PR-AUC, a metric that only measures how well scores rank and ignores whether they are usable at a fixed threshold. We show this …

  3. arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Jen Agarwal ·

    Closing the Calibration Gap in Semantic Caching

    Semantic caching cuts LLM inference costs by serving a cached response to semantically similar queries. Standard practice evaluates these systems using PR-AUC, a metric that only measures how well scores rank and ignores whether they are usable at a fixed threshold. We show this …

  4. dev.to — LLM tag TIER_1 English(EN) · Machine coding Master ·

    Stop Wasting LLM Budgets: High-Performance Semantic Caching with Spring AI and pgvector

    Your enterprise is likely bleeding thousands of dollars on duplicate LLM API calls because your Redis cache fails when a user asks "How do I reset my password?" instead of "Passw…
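The sources above describe semantic caching as serving a cached response to semantically similar queries. A minimal sketch of that lookup follows. It is not the Spring AI and pgvector setup from the last source: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `SemanticCache`, the 0.7 threshold, and the example queries are illustrative assumptions.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for missing words
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Serve a stored response when the best similarity clears a fixed threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, query, response):
        self.entries.append((toy_embed(query), response))

    def get(self, query):
        if not self.entries:
            return None
        q = toy_embed(query)
        score, response = max(
            ((cosine(q, emb), resp) for emb, resp in self.entries),
            key=lambda pair: pair[0],
        )
        # Cache hit only if the nearest entry is similar enough; otherwise miss.
        return response if score >= self.threshold else None


cache = SemanticCache(threshold=0.7)
cache.put("how do I change my email address", "Go to Settings > Account.")
print(cache.get("how do I change my email"))
print(cache.get("what is the weather today"))
```

The single fixed `threshold` is the value whose usability the paper's abstract says PR-AUC ignores: it decides every hit or miss, so a shift in the score distribution changes cache behavior even when the ranking of candidates stays the same.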