Gemini 3.1 Pro
PulseAugur coverage of Gemini 3.1 Pro: every cluster mentioning Gemini 3.1 Pro across labs, papers, and developer communities, ranked by signal.
- used by Gemini app 90%
- used by Vertex AI 90%
- instance of Google I/O 90%
- developed by Gemini Enterprise Agent Platform 90%
- used by Gemini API 90%
- instance of Gemini 3 Flash 90%
- developed by Artificial Analysis 90%
- competes with DeepSeek 80%
- competes with Claude Opus 4.6 70%
- used by arXiv 70%
- competes with Gemini 3.5 Flash 70%
- instance of Gemini app 70%
16 days with sentiment data
- GPT-5 leads AI model usage rankings, outpacing benchmark champions
A new ranking system based on actual user adoption and discussion, rather than solely benchmark scores, reveals a significant divergence in AI model popularity. GPT-5 emerges as the top-ranked model by usage, despite ne…
- AI agents fail real-world tasks, new SaaS-Bench reveals
A new benchmark called SaaS-Bench has revealed that current AI agents struggle significantly with real-world, long-horizon tasks, with top models like Claude Opus 4.7 achieving less than 4% success rate on fully complet…
- Google's Gemini 3.5 Flash outperforms 3.1 Pro on coding and agents
Google's Gemini 3.5 Flash model has surpassed its predecessor, Gemini 3.1 Pro, on several key benchmarks, particularly in coding and agentic tasks. This new tier offers a significant cost reduction of 40% and approximat…
- Frontier LLMs fall short in cybersecurity tasks, study finds
A new research paper evaluates the readiness of frontier large language models for cybersecurity tasks, finding that general-purpose models struggle with both vulnerability detection and security testing. The study test…
- Tencent Hunyuan releases Hy-MT2 translation model with 33-language support
Tencent Hunyuan has released its new Hy-MT2 translation model, available in three sizes (1.8B, 7B, and 30B-A3B) and supporting 33 languages. The model demonstrates strong performance, with the 7B and 30B versions outper…
- Alibaba's Qwen 3.6 open-weight model rivals frontier AI on coding tasks
Alibaba's Qwen 3.6 model family, particularly the 27B dense variant, has demonstrated performance competitive with leading frontier models like GPT-5.4 and Claude 4.6 on coding tasks. This open-weight model, runnable on…
- Small Turkish LLM beats GPT-5.5, Claude Opus on e-commerce task
A researcher has demonstrated that a smaller, open-source Turkish language model can outperform frontier models like Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro on a specific e-commerce attribute extraction task. By fi…
- Google launches Gemini 3.5 Flash for faster agentic tasks
Google has released Gemini 3.5 Flash, a new AI model designed for speed and agentic tasks. It is positioned as a faster and cheaper alternative to models like Anthropic's Claude Opus 4.7 and OpenAI's GPT-5.5 for tasks w…
- New benchmark PPaint fuses preference and rating data for aesthetic scoring
Researchers have developed a new benchmark called PPaint for image aesthetic assessment, which uses both pairwise preferences and pointwise ratings from experts. This dual-protocol approach revealed that preferences pro…
- Anthropic's Claude leads in AI safety benchmark, outperforming rivals
A new benchmark, DystopiaBench, reveals that Anthropic's Claude models continue to exhibit superior safety alignment compared to other leading LLMs. Across six dystopian scenarios, Claude consistently refused to generat…
- New LivePI benchmark reveals AI agent vulnerabilities to prompt injection
Researchers have developed LivePI, a new benchmark designed to more realistically assess the risks of indirect prompt injection in AI agents. This benchmark simulates real-world scenarios across various input channels l…
- Snowflake AI_COMPLETE adds video and audio analysis to SQL
Snowflake has released a public preview of a new multimodal capability for its AI_COMPLETE function, allowing users to directly input video and audio files. This update simplifies complex data analysis pipelines by enab…
- Poetiq's Meta-System boosts LLM coding performance without fine-tuning
Poetiq has developed a Meta-System that automatically creates an inference harness, significantly improving LLM performance on coding benchmarks without any model fine-tuning. This system achieved state-of-the-art resul…
- Omnimodal LLMs fail to act on detected sensory contradictions
Researchers have identified a "Representation-Action Gap" in omnimodal large language models, where models can internally recognize contradictions between textual claims and their sensory inputs but fail to reflect this…
- Microsoft Research: LLMs corrupt 25% of documents in delegated tasks
A new benchmark, DELEGATE-52, developed by Microsoft Research, reveals that current large language models significantly corrupt documents during delegated workflows. Even advanced models like Gemini 3.1 Pro, Claude 4.6 …
- Open-source AI workspace OpenGravity clones Google Antigravity
A developer has created OpenGravity, an open-source, zero-install JavaScript clone of Google's Antigravity AI workspace, designed to overcome rate-limiting issues. This tool offers a browser-based IDE with a live termin…
- Snowflake previews multimodal AI analysis, Iceberg v3 GA
Snowflake has launched a public preview for its multimodal video and audio analysis capabilities, allowing users to extract insights from rich media directly within the platform. This new feature supports models like Cl…
- New system MemPrivacy shields user data in edge-cloud AI agents
Researchers have developed MemPrivacy, a system designed to protect sensitive user information in LLM-powered agents that utilize cloud-assisted memory management. MemPrivacy identifies and masks private data on edge de…
- Baidu's ERNIE 5.1 ranks top 4 in search, leveraging deep tech expertise
Baidu's ERNIE 5.1 model has achieved a top-4 ranking on the Search Arena leaderboard, surpassing models like Gemini 3.1 Pro and GPT-5.4 in search capabilities. This performance highlights Baidu's long-standing expertise…
- Google DeepMind AI assists mathematicians, tops FrontierMath benchmark
Google DeepMind has released an AI system called "AI Co-Mathematician" designed to collaborate with human mathematicians on complex problems. This system, built on Gemini 3.1 Pro, achieved a new state-of-the-art score o…