VentureBeat
PulseAugur coverage of VentureBeat — every cluster mentioning VentureBeat across labs, papers, and developer communities, ranked by signal.
-
Anthropic enhances Claude code artifacts for enterprise collaboration
Anthropic has released an update to its Claude code artifacts feature, introducing live shared dashboards and interactive workspaces tailored for enterprise users. This enhancement aims to improve collaboration and prod…
-
AWS Context knowledge graph learns from agents
Amazon Web Services (AWS) has developed a knowledge graph that can learn from agents, as reported by VentureBeat. This system aims to enhance how AWS Context understands and processes information by leveraging agentic l…
-
AI optimizer outperforms Claude Code and Codex by 2.5x
An AI optimizer has demonstrated superior performance compared to both Claude Code and Codex, achieving results 2.5 times faster. This advancement was reported by VentureBeat, highlighting the optimizer's efficiency in …
-
Z.ai's GLM-5.2 outperforms GPT-5.5 on coding benchmarks at lower cost
Z.ai has released GLM-5.2, an open-weights model that demonstrates superior performance on long-horizon coding benchmarks compared to GPT-5.5. Notably, GLM-5.2 achieves this performance at a fraction of the cost, speci…
-
AI agents: hype vs. reality in production deployments
The author argues that the current hype around AI agents is misleading, as many systems labeled as agents are merely sophisticated function calls. True agents, in the author's view, possess objectives, handle failures, …
-
Z.ai's GLM-5.2 beats GPT-5.5 on coding benchmarks at lower cost · 3 sources tracked
Z.ai has released its open-weight GLM-5.2 model, which reportedly outperforms GPT-5.5 on several long-horizon coding benchmarks. The new model achieves this performance at a significantly lower cost, approximately one-…
-
Anthropic overhauls Claude Design to cut token use and boost enterprise adoption
Anthropic has updated Claude Design to address token consumption issues that limited its enterprise adoption. The company has integrated Claude Design's usage limits with other Claude products like Claude Cowork and Cla…
-
Zhipu AI releases open-source GLM-5.2, challenging OpenAI amid regulatory probes · 1 source tracked
Zhipu AI has released GLM-5.2, an open-weights model that matches closed-source leaders like OpenAI's GPT-5.5 on coding benchmarks at a significantly lower cost. This move allows enterprises to self-host and fine-tune t…
-
Anthropic overhauls Claude Design, fixing token burn and boosting enterprise features · 6 sources tracked
Anthropic has significantly updated Claude Design, addressing its previous high token consumption and repositioning it as an enterprise-grade tool. The overhaul includes new features like design-system imports for brand…
-
AI Agents: Focus on Objectives and Failure Handling, Not Just Models
The author argues that the current definition of AI agents is too broad, leading to engineering mistakes. A true agent, they contend, possesses an objective and makes independent decisions, rather than merely executing …
-
AI-powered deception escalates, demanding machine-speed truth verification
The increasing use of AI by malicious actors to create deceptive content poses a significant challenge for defense systems. To counter these sophisticated attacks, there is a growing need for rapid, machine-speed truth …
-
Sakana AI launches enterprise research agent Marlin for 100-page reports
Sakana AI has launched its first commercial product, Sakana Marlin, an enterprise agent designed for in-depth research. This autonomous agent can run for up to eight hours, generating comprehensive reports of up to 100 …
-
AI Agents: Production Reality vs. Hype
The author argues that the current hype around AI agents is diluting the term, leading to engineering mistakes. A true agent, they contend, possesses an objective and can independently decide its next steps, handle fail…
-
AI Agents: Production Reality vs. Hype
The current hype around AI agents often oversimplifies their capabilities, leading to engineering missteps. A true agent is defined by having an objective and making independent decisions, rather than just executing ins…
-
Google researchers introduce 'faithful uncertainty' for LLMs
Google researchers have developed a new technique called 'faithful uncertainty' to improve the reliability of large language models. This method enables LLMs to express confidence in their answers, offering best guesses…
-
New $1,500 Foundation Model Rivals Larger LLMs
A new foundation model from sigmoid.social has been announced, with a training cost of only $1,500. It is reported to be competitive with larger, more expensive large language models. The development was hi…
-
AI Lab Successes Often Fail in Production Due to Data and Scale Issues
AI models that perform well in controlled laboratory settings frequently encounter challenges when deployed in real-world production environments. These failures often stem from discrepancies between training data and l…
-
AI Benchmarks Fall Short of Real-World Performance Metrics
AI benchmarks often fail to capture true real-world performance, according to an analysis. These benchmarks may not accurately reflect how AI models function in practical, dynamic environments. The discussion highlights…
-
NanoClaw and JFrog launch AI agent 'immune system' against malicious code
NanoClaw and JFrog have partnered to release a new security system designed to protect against malicious code being downloaded by AI agents. This 'immune system' automatically identifies and blocks harmful packages, gui…
-
GPT-5.5 Outperforms Claude Fable 5 on New AI Agent Benchmark
OpenAI's GPT-5.5 has reportedly outperformed Anthropic's Claude Fable 5 on the new Agents' Last Exam (ALE) benchmark. This benchmark, developed by UC Berkeley, evaluates AI agents' ability to perform complex, multi-step…