AI News: February 23, 2026
The 5 headline stories that surfaced on PulseAugur that day, ranked by combining signals from labs, papers, and developer communities.
-
Most AI models fail simple 'car wash' reasoning test, Opper finds
A new benchmark called the "Car Wash Test" reveals that many leading AI models struggle with basic reasoning. When asked whether to walk or drive 50 meters to a car wash, 42 out of 53 tested models incorrectly suggested walking. Even top-tier models like Claude Sonnet 4.5 and GP…
-
Claude CLI matches 700 PubMed papers to 200 clinical trials
A case study details how Claude, an AI assistant, was used to semantically link 700 PubMed papers to 200 clinical trials. The process involved evaluating potential matches based on drug aliases, study designs, and terminology across medical datasets. The FutureSearch tool, integ…
-
OpenAI partners with BCG, McKinsey, Accenture, and Capgemini for AI deployment
OpenAI has launched its Frontier Alliance Partners program, collaborating with major consulting firms like Boston Consulting Group, McKinsey & Company, Accenture, and Capgemini. These partnerships aim to help enterprises integrate and scale OpenAI's Frontier platform, which faci…
-
A Brief History of the History of Science
James Bryant Conant, a prominent organic chemist and President of Harvard, played a significant role in transforming the US into a scientific technocracy during the 20th century. He led initiatives like the National Defense Research Committee and advised on the atomic bomb's use…
-
Speech models fail on street names, especially for non-native speakers
Researchers at Together AI have found that current state-of-the-art speech recognition models average a 39% error rate when transcribing street names, and that non-native English speakers are 18% more likely to be misunderstood. This inac…