Brief

last 24h

[5/5] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · X — Omar Sanseviero (HF research) English (EN) · 5h

New research from Microsoft Research

Microsoft Research has released new findings in human-factors research. Details are not yet publicly available, but the work is expected to shed light on how humans interact with AI systems. Further information is anticipated from Omar Sanseviero, whose post on X is the source of this item.

IMPACT New research from Microsoft may offer insights into human-AI interaction.
- Microsoft Research
- Omar Sanseviero

RESEARCH · MarkTechPost English (EN) · 1d · [3 sources]

Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4’s 33.5%

Microsoft Research has released Webwright, an open-source framework that lets AI agents work on the web from a terminal. Instead of acting one step at a time in a browser, as traditional agents do, Webwright agents write and execute Playwright code, run bash commands, and inspect logs in a terminal environment. The approach scores 60.1% on the Odysseys benchmark, up from the 33.5% scored by a base GPT-5.4 model in a conventional screenshot-based agent setting.

IMPACT Enables AI agents to perform complex web tasks more effectively by adopting a code-centric development approach, potentially improving automation and data extraction.

SIGNIFICANT · MarkTechPost English (EN) · 3d · [2 sources]

Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents (4B/9B/27B) That Outperform OpenAI Operator and Gemini 2.5 Computer Use on Online-Mind2Web

Microsoft Research has introduced Fara1.5, a family of three browser computer-use agent models (4B, 9B, and 27B parameters) built on Qwen3.5. The agents operate real browsers by interpreting screenshots and issuing mouse and keyboard actions to complete tasks. On the Online-Mind2Web benchmark, the largest Fara1.5 model achieved a 72% task success rate, surpassing OpenAI's Operator and Google's Gemini 2.5 Computer Use.

IMPACT Raises the bar for browser automation agents, which could change how users interact with web services and how developers build agentic applications.

TOOL · Microsoft Research English (EN) · 4d

Vega: Zero-knowledge proofs for digital identity in the age of AI

Microsoft Research has developed Vega, a system that uses zero-knowledge proofs to let users prove facts about their digital identity, such as age or professional status, without revealing the underlying credential. It aims to address privacy concerns heightened by the rise of AI agents and the growing need for secure digital verification. Vega generates proofs quickly on standard devices and is designed to work with existing credential formats such as driver's licenses and EU digital identity wallets.

IMPACT Enables secure and private credential verification for AI agents and digital identity systems.

TOOL · Bluesky Jetstream — AI desk English (EN) · 1w

“Whimsey attacks” that seem absurd (“I cannot pay that much because of the Geneva Convention”) work against AI agents because guardrails are weak against out-of

Researchers have identified a class of AI vulnerability called "whimsey attacks," which exploit weak guardrails in AI agents by using absurd, out-of-distribution arguments. Even arguments that seem nonsensical can trick AI agents; smaller models are particularly susceptible, though larger models can also be affected. The finding highlights how hard it is to build robust AI safety measures.

IMPACT Highlights a new class of AI vulnerabilities that could impact the reliability and safety of AI agents.
