AlphaZero
PulseAugur coverage of AlphaZero: every cluster mentioning AlphaZero across labs, papers, and developer communities, ranked by signal.
Sentiment data available for 3 days
-
MAPLE algorithm enhances AlphaZero for imperfect-information games
Researchers have developed a new tree search method called MAPLE, designed to improve the performance of AlphaZero-style algorithms in imperfect-information games. Unlike previous methods that struggle with strategy fus…
-
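RL framework automates security protocol analysis in Tamarin
Researchers have developed a reinforcement learning (RL) framework that automates and shortens the analysis of security protocols with the Tamarin tool. The AlphaZero-inspired approach uses a neural heuristic to guide Monte Carlo tree search and learns from completed subproofs. In an evaluation on 16 case studies, the RL approach automatically found more proofs and produced shorter ones than existing methods, significantly reducing the manual effort that protocol verification requires.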
-
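Demis Hassabis's AI work wins Nobel Prize in Chemistry
Demis Hassabis's pioneering work, including AlphaGo, AlphaZero, and AlphaFold, has substantially advanced artificial intelligence and its application to science. He shared the 2024 Nobel Prize in Chemistry with John Jumper and David Baker.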
-
New MCTS policies improve Monte Carlo Tree Search with variance awareness
Researchers have developed a new methodology called Inverse-RPO to systematically derive prior-based tree policies for Monte Carlo Tree Search (MCTS). This approach builds upon framing MCTS as a regularized policy optim…
-
Claude Opus 4.7 leads frontier agents in AI research acceleration benchmark
A new research paper proposes a benchmark to assess AI's ability to autonomously implement machine learning pipelines, aiming to detect early signs of recursive self-improvement. Frontier coding agents were tasked with …
-
Former DeepMind researcher David Silver raises $1.1B for AI that learns without human data
Ineffable Intelligence, a new AI lab founded by former DeepMind researcher David Silver, has secured $1.1 billion in funding. The company aims to develop a "superlearner" that can acquire knowledge and skills autonomous…
-
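DeepMind AlphaGo lead David Silver launches Ineffable Intelligence with Sequoia investment
David Silver, a key figure behind DeepMind's AlphaGo and other AI projects, has founded a new research lab called Ineffable Intelligence. The lab aims to build a "superlearner" that acquires knowledge through direct experience rather than pretraining on existing data. The approach, rooted in reinforcement learning, is intended to let AI make novel discoveries in science and mathematics.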
-
LLMs struggle to play video games, despite coding prowess, experts say
Despite rapid advancements in areas like coding, large language models (LLMs) demonstrate significant limitations when it comes to playing video games. While some models have achieved success in specific games, their pe…
-
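Andrej Karpathy discusses Sutton's criticism that LLMs fail to embody the "bitter lesson"
Andrej Karpathy discussed a podcast featuring Richard Sutton, in which Sutton questioned the widely held view that large language models (LLMs) fully embody his "bitter lesson" principle. Sutton argues that LLMs rely heavily on finite, human-generated data, which raises concerns about bias and future limitations. He contrasted this with the "child machine" he envisions, which learns through dynamic interaction with the world, much as animals do, without extensive pretraining on human text. Karpathy agrees that current …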
-
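Hobbyist aims to win Trackmania's Cup of the Day using machine learning
The article describes a project to develop a machine learning program that can win Division 1 of Trackmania's "Cup of the Day" without any prior knowledge of the map. The author's motivation is to explore state-of-the-art machine learning techniques that a hobbyist can implement on a single computer, in contrast to current models that require massive datasets and processing power. To that end, they plan to use tools such as TMInterface to work with the earlier game Trackmania Nations Forever.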