Researchers have developed MMIR-TCM, a framework for clinical decision support in Traditional Chinese Medicine (TCM) that addresses the semantic gap between visual tongue features and textual reasoning. The framework integrates a multimodal large language model (MLLM) with memory-augmented segmentation and retrieval-augmented generation (RAG). Its three-stage architecture comprises a memory-SAM module for tongue extraction, a fine-tuned Qwen3-VL model for diagnosis generation, and a Qwen3-based RAG component for evidence-grounded support. MMIR-TCM was developed and validated on MedTCM, a new large-scale multimodal dataset, and evaluated with a domain-specific metric called TDEU, demonstrating superior performance over models such as GPT-4o and Gemini 2.5 Flash.
AI IMPACT This research could lead to more accurate and reproducible diagnostic tools in Traditional Chinese Medicine, potentially improving patient outcomes.
RANK_REASON The cluster describes a new research paper detailing a novel framework and dataset for a specific domain.
- Gemini 2.5 Flash
- GPT-4o
- MedTCM
- MMIR-TCM
- multimodal large language model
- Qwen3
- Qwen3-VL
- TDEU
- Traditional Chinese Medicine
AI-generated summary · Google Gemini · from 1 source.