ScienceAgentBench
PulseAugur coverage of ScienceAgentBench — every cluster mentioning ScienceAgentBench across labs, papers, and developer communities, ranked by signal.
- AI research systems gain failure-aware memory for improved performance
  Researchers have developed a novel 'negative knowledge memory layer' designed to improve AI-assisted research systems. This system converts failed attempts into structured, typed records within a shared bank, which down…
- D3-Gym dataset offers verifiable environments for AI scientific discovery
  Researchers have introduced D3-Gym, a novel dataset designed to create verifiable environments for scientific data-driven discovery tasks. This dataset includes 565 tasks from real scientific repositories, each with ins…
- DataPRM enhances LLM data analysis by rewarding scientific process
  Researchers have developed DataPRM, a new process reward model designed to improve the performance of AI agents in dynamic data analysis tasks. Unlike previous models that struggled with silent errors and exploratory ac…