PulseAugur
frontier release · [1 source]

Cosine Genie leverages GPT-4o fine-tuning to become top coding agent

Cosine has launched Genie, a coding agent that has achieved the top ranking on the SWE-Bench benchmark, surpassing previous leaders by a significant margin. This success is attributed to fine-tuning OpenAI's GPT-4o model on billions of tokens of synthetically generated code and runtime errors. OpenAI collaborated with Cosine on the scale and specifics of the fine-tuning process, including the dynamic sizing of LoRA adapters. Genie uses a four-stage workflow and is designed to output code in formats suitable for direct integration into codebases.
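Cosine's actual training pipeline is not public, but OpenAI's fine-tuning API accepts training data as chat-format JSONL, one example per line. A minimal sketch of assembling one synthetic (bug, runtime error, fix) record in that format, with all task content hypothetical:

```python
import json

def make_example(task: str, buggy_code: str, traceback: str, fix: str) -> str:
    """Serialize one (bug, traceback, fix) triple as a chat-format
    fine-tuning record. Illustrative only: the system prompt, field
    layout, and example content are assumptions, not Cosine's data."""
    record = {
        "messages": [
            {"role": "system", "content": "You are a software engineering agent."},
            {"role": "user", "content": f"{task}\n\n{buggy_code}\n\nRuntime error:\n{traceback}"},
            {"role": "assistant", "content": fix},
        ]
    }
    return json.dumps(record)

# One line of a would-be training .jsonl file
line = make_example(
    task="Fix the off-by-one bug in this loop.",
    buggy_code="for i in range(len(xs) + 1): total += xs[i]",
    traceback="IndexError: list index out of range",
    fix="for i in range(len(xs)): total += xs[i]",
)
parsed = json.loads(line)
print(len(parsed["messages"]))  # 3: system, user, assistant
```

At billion-token scale the interesting work is generating and filtering such pairs from real repositories and their error traces, not the serialization itself.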

Summary written by gemini-2.5-flash-lite from 1 source.

RANK_REASON New coding agent (Genie) from Cosine achieves state-of-the-art results on SWE-Bench using fine-tuned GPT-4o, a significant advancement in AI coding capabilities.

COVERAGE [1]

  1. Latent Space Podcast TIER_1 · Latent.Space

    Is finetuning GPT4o worth it? — with Alistair Pullen, Cosine (Genie)

    Betteridge's law (https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines) says no: with seemingly infinite flavors of RAG, and >2 million token context + prompt caching from Anthropic/Deepmind/Deepseek, it's reasonable to believe that "in cont…