Ensembles of Large Language Models for Identifying EQ-5D Studies in PubMed Based on Their Abstracts
Researchers have developed a framework that uses ensembles of Google's Gemini and Gemma large language models to automate the identification of EQ-5D studies in the PubMed database from their abstracts. The multi-phase approach combines few-shot prompting, weighted ensembling, and a soft stacking meta-classifier to improve the accuracy and efficiency of biomedical literature screening. The weighted ensemble of Gemini 2.5 Pro, Gemma 3 12B, and Gemma 3 27B achieved a weighted F1-score of 0.74, outperforming each individual model and pointing to a reliable, scalable method for literature review automation.
AI IMPACT: This research demonstrates a scalable approach to automating literature reviews in biomedical research, which could accelerate scientific discovery.
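
The weighted ensembling step mentioned in the summary can be illustrated with a minimal sketch of weighted soft voting. This is not the authors' implementation: the summary does not say how the weights were chosen or how model outputs were combined, so the function name `weighted_soft_vote`, the example probabilities, the weights, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def weighted_soft_vote(
    probabilities: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Combine per-model probabilities that an abstract is an EQ-5D study.

    Each probability is multiplied by its model's weight, and the sum is
    normalised by the total weight. All values here are hypothetical.
    """
    if not probabilities or len(probabilities) != len(weights):
        raise ValueError("need one weight per model probability")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    score = sum(p * w for p, w in zip(probabilities, weights)) / total_weight
    return score, score >= threshold


if __name__ == "__main__":
    # Hypothetical outputs for one abstract, ordered as
    # (Gemini 2.5 Pro, Gemma 3 12B, Gemma 3 27B). Not taken from the study.
    model_probabilities = [0.9, 0.4, 0.6]
    model_weights = [0.5, 0.2, 0.3]  # assumed, e.g. tuned on validation data
    score, is_eq5d = weighted_soft_vote(model_probabilities, model_weights)
    print(f"ensemble score={score:.2f}, flagged as EQ-5D study={is_eq5d}")
```

A soft stacking meta-classifier, as described in the summary, would go one step further and learn the combination from the models' probabilities rather than using fixed weights.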