Automated Standardization of Legacy Biomedical Metadata Using an Ontology-Constrained LLM Agent
Researchers have developed an LLM-based system that automatically standardizes legacy biomedical metadata, addressing the incompleteness and non-compliance that hinder data usability. The system augments LLMs by letting them query standard reporting guidelines and terminology services in real time, retrieving accurate standards on demand. In evaluations on 839 HuBMAP legacy metadata records, this real-time tool access consistently improved prediction accuracy compared with LLMs relying solely on their training data.
AI IMPACT This system could significantly improve the findability, interoperability, and reuse of biomedical datasets by automating metadata compliance.
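
To make the tool-access idea concrete, here is a minimal Python sketch of how a legacy value might be checked against terms retrieved on demand. It is not the researchers' implementation. The field names, the in-memory `PERMITTED_VALUES` table (standing in for a live terminology service), and the helpers `lookup_permitted_values`, `standardize_value`, and `standardize_record` are all hypothetical, and fuzzy string matching stands in for the LLM's choice among retrieved terms.

```python
from difflib import get_close_matches

# Hypothetical stand-in for a terminology service: field name -> permitted values.
# A real system would query a live service instead of this dictionary.
PERMITTED_VALUES = {
    "preparation_medium": ["Formalin", "Methanol", "Paraformaldehyde"],
    "storage_temperature_unit": ["Celsius", "Kelvin"],
}


def lookup_permitted_values(field):
    """Tool the model could call: return the permitted values for a field."""
    return PERMITTED_VALUES.get(field, [])


def standardize_value(field, legacy_value):
    """Map a legacy value to a permitted term, or None if nothing is close.

    Fuzzy matching is an assumption here; it stands in for the LLM's choice.
    """
    permitted = lookup_permitted_values(field)
    by_lower = {term.lower(): term for term in permitted}
    matches = get_close_matches(
        legacy_value.strip().lower(), list(by_lower), n=1, cutoff=0.8
    )
    return by_lower[matches[0]] if matches else None


def standardize_record(record):
    """Return (standardized record, list of fields left unresolved)."""
    standardized, unresolved = {}, []
    for field, value in record.items():
        term = standardize_value(field, value)
        if term is None:
            standardized[field] = value  # keep the original rather than guess
            unresolved.append(field)
        else:
            standardized[field] = term
    return standardized, unresolved


legacy = {
    "preparation_medium": "formalin ",
    "storage_temperature_unit": "celcius",
    "donor_note": "n/a",
}
fixed, unresolved = standardize_record(legacy)
print(fixed)
print(unresolved)
```

In this sketch, values with no retrieved standard are left unchanged and reported as unresolved, so a human curator can review them instead of the system guessing.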