SMDD-Bench: Can LLMs Solve Real-World Small Molecule Drug Design Tasks?
Researchers have introduced SMDD-Bench, a benchmark for evaluating large language model agents on small molecule drug design. It comprises 502 task instances of five types, including scaffold hopping and lead optimization, spanning 102 unique protein targets. Even the top-performing model, GPT-5.4, solved only 40.2% of the tasks, showing how far the field remains from fully autonomous computational drug design.
AI IMPACT: Highlights the current limitations of LLM agents in complex scientific domains and helps guide future research in autonomous drug design.