Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?
Researchers have introduced ToxiMol, a benchmark for evaluating how well multimodal large language models (MLLMs) can repair toxic molecules. It comprises a dataset of 660 toxic molecules spanning 11 tasks and an automated evaluation framework called ToxiEval. Initial experiments with 43 MLLMs show that current models still struggle with the task, though they are beginning to show promising abilities in understanding toxicity and performing structure-aware edits.
AI IMPACT: Establishes a new evaluation standard for MLLMs in molecular toxicity repair, potentially guiding future drug development research.