Researchers have introduced SupraBench, a new benchmark for evaluating large language models (LLMs) in supramolecular chemistry. The benchmark addresses the need for systematic evaluation of LLMs on tasks such as binding affinity prediction and host-guest reasoning, which are crucial for accelerating the design of molecular assemblies. Alongside SupraBench, the researchers released SupraPMC, a 16M-token corpus intended to help adapt LLMs to this specialized domain. Initial benchmarking revealed significant room for improvement across the LLMs tested, and domain adaptation showed mixed results depending on the task.
AI-generated summary · Google Gemini · from 2 sources.