ENTITY
Senior SWE Bench
PulseAugur coverage of Senior SWE Bench: every cluster mentioning it across labs, papers, and developer communities, ranked by signal.
Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
SENTIMENT · 30D
2 days with sentiment data
RECENT · PAGE 1/1 · 2 TOTAL
- New open-source benchmark evaluates AI agents as senior engineers
Senior SWE-Bench is a new open-source benchmark that evaluates how well AI agents perform tasks typically handled by senior software engineers. Developed by Snorkel AI, this benchmark aims to …
- New Senior SWE Bench evaluates LLMs on underspecified software tasks
Senior SWE Bench is a new benchmark that evaluates large language models on realistically underspecified tasks. It focuses on feature tasks, aiming to better reflect real-world s…