PulseAugur

LLM reliability and cost-effectiveness drive new infrastructure solutions

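The integration of large language models (LLMs) into professional workflows is shifting from experimental use to an essential tool, with the emphasis on collaboration rather than automation. However, the reliability of LLM providers is becoming a critical concern, and frequent outages call for robust fallback mechanisms. To address this, open-source solutions such as Bifrost are emerging at the gateway layer to manage adaptive model routing and fallback logic, so that applications keep running when a provider fails. At the same time, optimizing the cost of LLM evaluation in CI/CD pipelines is essential, because batching jobs and adopting a tiered testing strategy can significantly reduce GPU spend.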
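The fallback behaviour described in the summary can be illustrated with a minimal sketch. This is not Bifrost's API: `ProviderError`, `call_with_fallback`, and the two stand-in provider functions are hypothetical names chosen for illustration, and a real gateway would also handle timeouts, retries, and health tracking.

```python
from __future__ import annotations

from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error raised by a provider callable when a request fails."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in priority order; return (provider_name, reply)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers (assumptions): the primary is down, the backup answers.
def flaky_primary(prompt: str) -> str:
    raise ProviderError("503 from upstream")


def steady_backup(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = call_with_fallback(
    "ping", [("primary", flaky_primary), ("backup", steady_backup)]
)
print(used, reply)
```

Ordering the provider list by preference keeps the routing decision in data rather than in control flow, which is one plausible reason to place this logic in a gateway instead of in each application.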

Impact: As LLM adoption grows, emerging infrastructure solutions are critical to maintaining application uptime and reducing operating costs.

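Ranking rationale: This cluster discusses technical approaches to managing LLM reliability and cost-effectiveness, including adaptive routing, fallback logic, and CI/CD optimization strategies.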

Read on dev.to — LLM tag →

AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

  1. Mastodon — fosstodon.org TIER_1 Russian(RU) · [email protected] ·

    LLMs don't work for you. They work with you. Over the past couple of months, I've trained my team on how to integrate LLMs into the workflow. Not 'play with ChatGPT evenings'

    LLMs don't work for you. They work with you. Over the past couple of months, I have trained my team to build LLMs into their workflow. Not "play around with ChatGPT in the evening." Not "ask how to do such-and-such." The point is to start using LLMs in real work: code, writing, analysis, rev…

  2. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    RE: https://toot.liw.fi/@liw/116623992150017126 @liw @baldur I'm still waiting for a solid study showing that using LLMs for coding helps. I haven't even se

    RE: https://toot.liw.fi/@liw/116623992150017126 @liw @baldur I'm still waiting for a solid study showing that using LLMs for coding helps. I haven't even seen one that just shows that initial development of production-ready code is faster, ignoring maintenance issues down the…

  3. dev.to — LLM tag TIER_1 English(EN) · Kuldeep Paul ·

    Adaptive Model Routing and Fallback Logic: Routing Around LLM Provider Outages with Bifrost

    When LLM providers go down, adaptive model routing and fallback logic keep applications online. Here is how Bifrost runs both at the gateway tier. At runtime, adaptive model routing decides where each request goes, choosing the LLM provider, the specific model,…

  4. dev.to — LLM tag TIER_1 English(EN) · claire nguyen ·

    Stop paying for idle GPUs in your CI: batching LLM eval jobs

    TL;DR: Running LLM evaluations on every PR will burn your GPU budget faster than you can blink. We cut our eval spend by about 60% by batching jobs into windowed runs on shared GPU pools, plus a smarter queue that knows the difference between a "smoke test" eval and a …
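The batching idea in the last excerpt can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's implementation: `EvalJob`, `WindowedQueue`, and the `"full"` tier name are hypothetical (the excerpt is cut off before it names the second tier), and the window here closes on a job count rather than on elapsed time to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class EvalJob:
    pr: int
    tier: str  # "smoke" or "full"; "full" is an assumed name for the heavier tier


@dataclass
class WindowedQueue:
    window_size: int = 4  # assumed: number of heavy jobs that closes a window
    pending_full: list = field(default_factory=list)

    def submit(self, job: EvalJob) -> list:
        """Return the jobs to launch now; an empty list means the job is queued."""
        if job.tier == "smoke":
            # Cheap smoke evals run immediately on every PR.
            return [job]
        # Heavier evals wait so they can share one GPU allocation.
        self.pending_full.append(job)
        if len(self.pending_full) >= self.window_size:
            return self.flush()
        return []

    def flush(self) -> list:
        """Close the current window and hand back everything queued in it."""
        batch, self.pending_full = self.pending_full, []
        return batch


queue = WindowedQueue(window_size=2)
launched_smoke = queue.submit(EvalJob(pr=1, tier="smoke"))
launched_first = queue.submit(EvalJob(pr=2, tier="full"))
launched_batch = queue.submit(EvalJob(pr=3, tier="full"))
print([j.pr for j in launched_smoke], launched_first, [j.pr for j in launched_batch])
```

In a real pipeline a scheduler would presumably also call `flush()` on a timer so that a half-full window does not wait indefinitely.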