Gemini 2.5 Pro

PulseAugur coverage of Gemini 2.5 Pro: every cluster mentioning Gemini 2.5 Pro across labs, papers, and developer communities, ranked by signal.

Total · 30 days

49

90 days: 49

Releases · 30 days

0

90 days: 0

Papers · 30 days

32

90 days: 32

Tier distribution · 90 days

frontier release 2
significant 5
research 16
tool 24
commentary 2

Relations

Sentiment · 30 days

12 days with sentiment data

Recent · Page 3 of 3 · 49 items total

TOOL · CL_17686 · Oct 28 · 14:13

LLMs fail the "pass the butter" robot test, scoring far below human performance

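A new evaluation called Butter-Bench shows that current state-of-the-art large language models struggle significantly when controlling robots to perform practical tasks. In tests designed to assess their ability to carry out household chores such as passing the butter, the best-performing LLM reached only a 40% completion rate, far below the 95% success rate of humans. Models such as Gemini 2.5 Pro and Claude Opus 4.1 showed limitations in spatial awareness and task execution, highlighting the gap between LLM reasoning capabilities and real-world robotics applications.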
FRONTIER RELEASE · CL_01735 · Oct 23 · 18:54

Google DeepMind launches Deep Think for Gemini Ultra subscribers

Google DeepMind has released a new AI capability called Deep Think, now available to Google AI Ultra subscribers via the Gemini app. This feature utilizes parallel thinking techniques, allowing the model to explore mult…
FRONTIER RELEASE · CL_01739 · Jun 17 · 16:00

Google DeepMind releases Gemini 2.5 Pro and Flash models and introduces a Flash-Lite preview

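Google DeepMind has officially launched the Gemini 2.5 Pro and Flash models, so that developers can build production applications with confidence. The company also introduced a preview of Gemini 2.5 Flash-Lite, which it describes as its most cost-efficient and fastest model to date. The new versions offer improved performance across a range of benchmarks and retain key features such as a 1 million token context length and multimodal input.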
FRONTIER RELEASE · CL_01711 · Jun 3 · 17:15

Google DeepMind enhances Gemini audio models for natural voice interactions and translation

Google DeepMind has released upgraded Gemini 2.5 audio models, enhancing capabilities for both live voice agents and text-to-speech generation. The Gemini 2.5 Flash Native Audio model now offers improved function callin…
FRONTIER RELEASE · CL_01837 · May 29 · 05:44

DeepSeek releases R1-0528, an open-weights model rivaling Gemini 2.5 Pro

DeepSeek has released DeepSeek-R1-0528, an open-weights model that rivals Gemini 2.5 Pro in performance. This release marks a significant advancement in publicly available AI models, offering a powerful alternative for …
RESEARCH · CL_00195 · Mar 21 · 21:34

AI code review bots show limits in automated evaluation, GitHub COO discusses ambient AI

A new paper explores the limitations of automated evaluation for AI code review bots, finding that current automated methods like G-Eval and LLM-as-a-Judge show only moderate alignment with human developer labels. The s…
FRONTIER RELEASE · CL_01724 · Feb 6 · 02:00

Google DeepMind releases Gemini 2.5 Flash-Lite, its fastest and cheapest model

Google DeepMind has released the stable version of Gemini 2.5 Flash-Lite, a fast and cost-efficient model designed for scaled production use. This model offers a balance of performance and affordability, with features l…
FRONTIER RELEASE · CL_00040 · Jun 25 · 07:02

Gemini 3.1 Pro and the Downfall of Benchmarks: Welcome to the Vibe Era of AI

Google DeepMind has released Gemini 3.1 Pro, an upgraded version of its core intelligence model, enhancing reasoning capabilities for complex problem-solving. This new model demonstrates significant improvements on benc…
RESEARCH · CL_00265 · Mar 30 · 15:00

Google AI teaches models to read maps and monitor nature

Google AI has developed a new system called MapTrace to train multimodal large language models (MLLMs) to visually follow routes on maps, addressing a gap in their spatial reasoning abilities. This system uses a scalabl…