OpenAI Codex
PulseAugur coverage of OpenAI Codex — every cluster mentioning OpenAI Codex across labs, papers, and developer communities, ranked by signal.
Sentiment data available for 7 days
-
Mastodon user jokes that the "OpenAI Codex" name has aged poorly
A user on Mastodon shared a humorous observation about the name "OpenAI Codex," noting that it has not aged well in light of current AI developments. The post includes a variety of hashtags related to AI, technology, an…
-
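Developer routes 200+ daily LLM calls across five models to cut costs
A developer detailed a strategy for managing AI inference costs: route each task to the cheapest model that meets the quality requirement. The approach, dubbed "inference arbitrage," relies on a multi-model stack with Claude Sonnet as the daily driver, Opus for complex reasoning, OpenAI's Codex for cross-checking, Gemini Flash for research, and a locally deployed Qwen model for sensitive data. The author's benchmark of 15 models on 38 tasks showed that most tasks do not need the most expensive model…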
-
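LiteLLM releases open-source Kubernetes platform for production AI agents
BerriAI has released the LiteLLM Agent Platform, an open-source, self-hostable infrastructure layer built on Kubernetes. The platform is designed to run multiple AI agents reliably in production, addressing challenges such as state management and the need for isolated runtimes. It provides per-team sandboxes and preserves session continuity across restarts and upgrades, and it is built with TypeScript, Next.js, and PostgreSQL.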
-
AI assists Rust developer in reverse-engineering RAR format
An individual used AI tools like Claude and OpenAI Codex to reverse-engineer the complex RAR file format in just five weeks. Despite the lack of official documentation, the developer analyzed open-source libraries and b…
-
China court bans AI firings; Pwn2Own rejects AI exploits; YC startups speed up with AI
A Chinese court has ruled that replacing workers with AI solely for cost reduction is illegal, setting a precedent for labor rights in the age of AI. Separately, the Pwn2Own Berlin hacking competition saw a large reject…
-
New 'judge' agent pattern aims to prevent coding AI from shipping incomplete code
Autonomous coding agents often declare victory prematurely when their automated checks pass, even if those checks are insufficient. This can lead to stubbed or incomplete implementations being shipped. To address this, …
-
Cursor 3.3 ships with PR threads; OpenClaw, OncoAgent also update
Cursor has released version 3.3 of its AI-powered IDE, introducing features like inline PR threads, commit views, and asynchronous subagents. Additionally, OpenClaw has launched a beta version, 2026.5.9-beta.1, which in…
-
Spotify lets AI agents generate and save personal podcasts
Spotify has launched a new beta command-line tool that allows AI agents, such as Claude Code and OpenAI Codex, to generate and save personal podcasts directly to a user's Spotify library. This feature aims to provide us…
-
New benchmarks enhance safety evaluation for AI agents in OpenClaw and Codex
Researchers have developed ATBench-Claw and ATBench-Codex, extensions to the ATBench framework for evaluating agent trajectory safety. These benchmarks are tailored for the OpenClaw and OpenAI Codex environments, respec…
-
WellPlayed WP launches subscription library for AI-assisted WordPress plugins
WellPlayed WP has launched a new subscription service offering a library of unique WordPress plugins, inspired by the SetApp model for Mac applications. This service provides access to over 20 plugins designed to fill g…
-
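Large language models struggle to reproduce physics experiment results and fall short on numerical simulation
A new preprint from Peking University evaluates how well large language models can reproduce the numerical results of physics experiment papers. The researchers found that every model tested, including OpenAI Codex powered by GPT-5.3, had an end-to-end reproduction rate of 0%, meaning none could reproduce a single complete numerical result. Although the models showed a deep understanding of the papers' methods, they made persistent errors in data analysis and numerical simulation, which led to incorrect final results. The study identifies several failure modes, such as incorrectly implemented formulas and oversimplified complex physical models.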
-
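ElevenLabs, Cerebras raise billions; Gemini 3 widely integrated; coding assistants converge in IDEs
Several AI companies have reached major funding milestones: ElevenLabs closed a $500 million Series D at an $11 billion valuation, and Cerebras closed a $1 billion Series H at a $23 billion valuation. Google is integrating its Gemini 3 model across its products, including a new Chrome sidebar, and reports significant adoption and lower serving costs for the model. The coding assistant landscape is shifting as VS Code and GitHub Copilot add support for multiple assistants, including Claude and OpenAI Codex, to…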
-
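Google's Gemini 3 showcases AI's leap in coding and interaction
Google's newly launched Gemini 3 model demonstrates how far AI capabilities have advanced over the past three years, from simple text generation to complex tasks such as interactive game creation and autonomous coding. This evolution, highlighted by Ethan Mollick, suggests that AI is becoming a general-purpose tool able to perform any computer-based task. Despite Gemini 3's promise, comparisons with ChatGPT on image generation, particularly in the context of the viral Ghibli trend, hint at possible weaknesses in Google's AI, although such trends reportedly earned ChatGPT 7000…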
-
Clad Labs launches platform to orchestrate multiple AI coding agents
Clad Labs has launched a new platform designed to orchestrate multiple AI coding agents, including Claude Code, Cursor, and OpenAI Codex. The tool allows developers to spin up teams of parallel agents, manage their work…