Researchers have developed a novel patient augmentation technique for data-scarce Multiple Instance Learning (MIL) in medical applications. This method generates realistic patient data in the embedding space using Gaussian Mixture Models to learn disease-specific instance distributions. The approach can create new patients by remixing pooled embeddings, even without examples from all categories, and selects generated patients based on uncertainty quantification to enhance MIL performance. Experiments across various scarcity scenarios, including cross-dataset transfer and low-data regimes for single-cell RNA-seq and flow cytometry, show improved performance over existing methods, with one scenario achieving performance comparable to full-dataset training.
AI IMPACT This method could significantly improve diagnostic capabilities for rare diseases by enabling effective model training with limited data.
RANK_REASON Research paper published on arXiv detailing a new method for data augmentation in medical machine learning.
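The augmentation idea in the summary (fit a Gaussian Mixture Model per disease class on pooled instance embeddings, sample synthetic patients from it, then filter them by uncertainty) can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names, the diagonal covariance, the number of components, the GMM-likelihood stand-in for a trained MIL model, and the choice to keep the most uncertain candidates are all assumptions made for illustration; the summary does not specify any of them.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_class_gmms(bags, labels, n_components=2, seed=0):
    """Fit one GMM per class on that class's pooled instance embeddings.

    bags: list of (n_instances, n_features) arrays, one per patient.
    labels: one class label per bag. All names here are hypothetical.
    """
    gmms = {}
    for cls in sorted(set(labels)):
        pooled = np.vstack([b for b, y in zip(bags, labels) if y == cls])
        gmm = GaussianMixture(
            n_components=n_components,
            covariance_type="diag",  # assumption: diagonal covariances
            random_state=seed,
        )
        gmms[cls] = gmm.fit(pooled)
    return gmms


def sample_patient(gmm, n_instances, rng):
    """Draw one synthetic bag of instance embeddings from a fitted diagonal GMM."""
    comps = rng.choice(len(gmm.weights_), size=n_instances, p=gmm.weights_)
    means = gmm.means_[comps]
    stds = np.sqrt(gmm.covariances_[comps])  # shape (n_instances, n_features)
    return means + stds * rng.standard_normal(means.shape)


def class_posterior(bag, gmms):
    """Stand-in for a MIL model: softmax over mean per-instance log-likelihoods."""
    classes = sorted(gmms)
    ll = np.array([gmms[c].score(bag) for c in classes])
    ll = ll - ll.max()  # numerical stability before exponentiating
    p = np.exp(ll)
    return p / p.sum()


def predictive_entropy(probs):
    """Shannon entropy of a class-probability vector, used as the uncertainty score."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())


def select_by_uncertainty(candidates, gmms, k):
    """Keep the k candidates with the highest entropy (an arbitrary direction)."""
    ent = [predictive_entropy(class_posterior(b, gmms)) for b in candidates]
    order = np.argsort(ent)[::-1][:k]
    return [candidates[i] for i in order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy embeddings: three patients per class, 30 instances each, 4 features.
    bags = [rng.normal(0.0, 1.0, (30, 4)) for _ in range(3)]
    bags += [rng.normal(3.0, 1.0, (30, 4)) for _ in range(3)]
    labels = [0, 0, 0, 1, 1, 1]

    gmms = fit_class_gmms(bags, labels)
    candidates = [sample_patient(gmms[1], 25, rng) for _ in range(10)]
    kept = select_by_uncertainty(candidates, gmms, k=3)
    print(len(kept), kept[0].shape)
```

In practice the uncertainty would come from the trained MIL classifier itself rather than from the generative model, and the selected synthetic patients would be added to the training set for the scarce class.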
- arXiv
- flow cytometry
- Gaussian Mixture Models
- Muhammed Furkan Dasdelen
- Multiple instance learning
- single-cell RNA-seq
AI-generated summary · Google Gemini · from 2 sources.