A new research paper introduces SCENE, a benchmark designed to evaluate how well AI models can recognize and adapt to social norms and sanctions in group chats. Early tests show that Anthropic's Claude Opus 4.7 and Google's Gemini 3.1 Pro perform significantly better than open-weight models on this benchmark. Meanwhile, Google has launched its Gemini Enterprise Agent Platform, integrating its Vertex AI with new features for building, managing, and securing AI agents, aiming to compete with offerings from OpenAI and Anthropic. AI
Summary written by gemini-2.5-flash-lite from 4 sources. How we write summaries →
IMPACT Google's new platform aims to simplify enterprise AI agent deployment, while new benchmarks push for more nuanced AI social interaction capabilities.
RANK_REASON Google announced a new enterprise platform for AI agents, and a research paper introduced a new benchmark for evaluating AI social capabilities, involving multiple frontier AI labs.