Brief

last 24h

[6/6] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
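
One way such a 0–100 score could be combined is sketched below. The brief does not publish its formula, so the weights, the 24-hour half-life, and the names `WEIGHTS`, `time_decay`, and `story_score` are all assumptions made for illustration.

```python
# Hypothetical weights: the brief names the signals but not how they are combined.
WEIGHTS = {"authority": 0.4, "cluster": 0.3, "headline": 0.3}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay factor


def time_decay(age_hours, half_life=HALF_LIFE_HOURS):
    """Exponential decay: 1.0 for a brand-new story, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life)


def story_score(authority, cluster, headline, age_hours):
    """Weight three signals in [0, 1], apply time decay, and scale to 0-100."""
    signals = {"authority": authority, "cluster": cluster, "headline": headline}
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    base = sum(WEIGHTS[name] * value for name, value in signals.items())
    return round(100.0 * base * time_decay(age_hours), 1)
```

Under these assumptions, a story with perfect signals scores 100 when fresh and half that after one half-life; a real ranking would likely tune the weights against editorial judgments.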

RESEARCH · The Decoder English(EN) · 1d · [4 sources]

Google DeepMind's AlphaProof Nexus solves decades-old math problems for a few hundred dollars

Google DeepMind's AlphaProof Nexus has autonomously solved nine open Erdős mathematical problems, including two that had remained unsolved for 56 years. The AI system, which pairs a large language model with the Lean compiler for automatic proof verification, achieved these breakthroughs at a cost of a few hundred dollars per problem. This development showcases AI's growing capability in generating original mathematical solutions and formal verification.

IMPACT Demonstrates AI's capacity for original mathematical discovery and formal verification, potentially accelerating research in complex fields.
TOOL · Modal blog English(EN) · 3d

Building an RL Theorem

AE Studio, a consulting partner for Modal, has developed a workflow for training AI models to prove mathematical theorems using reinforcement learning. They compared two methods: Group Relative Policy Optimization (GRPO) and Evolution Strategies (ES), finding ES to be a promising alternative for this task. The setup leverages Modal's infrastructure for parallel GPU inference and isolated CPU verification, streamlining the research process and accelerating AI-enabled scientific discovery.

IMPACT Demonstrates a novel approach to AI-driven mathematical theorem proving, potentially accelerating AI-enabled scientific discovery.
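
To make the Evolution Strategies idea concrete, here is a minimal sketch on a toy problem. It is not AE Studio's code: the quadratic `reward` in the usage note stands in for a proof-success signal, and the function name and all hyperparameters are assumptions. ES perturbs the parameters with Gaussian noise, scores each perturbed copy, and moves the parameters toward the perturbations that scored above average, with no backpropagation through the reward.

```python
import random


def evolution_strategies(reward, theta, sigma=0.1, lr=0.1,
                         population=50, steps=200, seed=0):
    """Maximize `reward` with a basic ES gradient estimate (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    theta = list(theta)
    dim = len(theta)
    for _ in range(steps):
        noises, rewards = [], []
        for _ in range(population):
            # Sample a Gaussian perturbation and score the perturbed parameters.
            eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            candidate = [t + sigma * e for t, e in zip(theta, eps)]
            noises.append(eps)
            rewards.append(reward(candidate))
        baseline = sum(rewards) / population  # mean reward, reduces variance
        for i in range(dim):
            # Gradient estimate: reward-weighted average of the noise directions.
            grad = sum((r - baseline) * eps[i]
                       for r, eps in zip(rewards, noises)) / (population * sigma)
            theta[i] += lr * grad
    return theta
```

For example, `evolution_strategies(lambda p: -((p[0] - 3) ** 2 + (p[1] + 1) ** 2), [0.0, 0.0])` should end up close to `[3, -1]`. Because each candidate is scored independently, the inner loop is the part that parallelizes across workers, which fits the parallel inference and isolated verification setup the summary describes.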
TOOL · Mastodon — mastodon.social English(EN) · 6d

Using algebra and LLMs to verify a flight-plan bug fix in Lean https://jameshaydon.github.io/algebra-llms-lean-flight-plan/ #Programming #AI #Math

The linked post combines large language models (LLMs) with algebraic methods to verify, in the Lean theorem prover, a bug fix to a piece of flight-plan software. The bug is in the flight-plan component, not in Lean itself; Lean is the tool that checks the fix. The post illustrates how LLMs can assist with the formal verification of software.

IMPACT Demonstrates a new method for using LLMs in formal software verification, potentially improving reliability in critical systems.
- Lean
- LLMs
- algebra
TOOL · arXiv cs.AI English(EN) · 4d

Advancing Mathematics Research with AI-Driven Formal Proof Search

Researchers have developed an AI agent capable of autonomously solving open mathematical problems by generating formal proofs in languages like Lean. This agent successfully resolved 9 out of 353 open Erdős problems and proved 44 out of 492 OEIS conjectures. The AI-driven formal proof search is being integrated into research across various mathematical fields, demonstrating its potential to advance scientific discovery.

IMPACT Demonstrates AI's growing capability in solving complex, open-ended research problems, potentially accelerating discovery across scientific disciplines.
RESEARCH · arXiv cs.CL English(EN) · 3d · [4 sources]

ImProver: Agent-Based Automated Proof Optimization

Researchers are exploring the use of agentic AI systems, particularly those leveraging large language models (LLMs), for complex tasks like program verification and mathematical theorem proving. Studies show these systems can achieve high success rates in generating valid specifications and certifying code, sometimes outperforming specialized models on new benchmarks. However, the research also highlights a growing gap between current AI capabilities and the rigor of existing verification benchmarks, suggesting a need for more robust evaluation methods.

IMPACT Agentic AI systems are demonstrating advanced capabilities in formal verification, potentially accelerating the development and reliability of complex software and mathematical proofs.
- Lean
- Riyaz Ahuja
- Large language models
- Ax-Prover
- Claude Code
- LLMs
- arXiv
- ImProver
RESEARCH · Hugging Face Daily Papers English(EN) · 1w · [2 sources]

Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

Researchers have developed Lean Refactor, a new framework designed to optimize proofs generated by large language models (LLMs) in the Lean proof assistant. This system addresses key challenges such as proof length, compilation cost, and version compatibility, which are often in tension. By using a retrieval-augmented agentic approach with a curated database of refactoring strategies, Lean Refactor achieves significant compression rates and reduces compilation times, outperforming previous methods and demonstrating improved version transfer capabilities.

IMPACT Introduces a novel method for improving the efficiency and robustness of LLM-generated mathematical proofs, potentially accelerating formal verification efforts.
- Lean Refactor
- Claude Code
- Lean
- LLM
