VietMed-MCQ: A Consistency-Filtered Data Synthesis Framework for Vietnamese Traditional Medicine Evaluation
Researchers have developed VietMed-MCQ, a multiple-choice dataset for evaluating Large Language Models (LLMs) on Vietnamese Traditional Medicine. The questions were generated with a Retrieval-Augmented Generation (RAG) pipeline and then passed through a novel consistency check intended to filter out inaccurate items. In a benchmark of seven open-source models, those with strong Chinese-language priors outperformed Vietnamese-centric models, which suggests potential for cross-lingual knowledge transfer. Complex diagnostic reasoning remained a challenge for all of the models tested.
AI IMPACT: Provides a specialized benchmark for measuring and improving LLM performance in low-resource, culturally specific medical domains.
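
The summary does not say how the consistency check works, so the following is only a rough illustration of one common way such a filter can be built: answer each generated question several times and keep it only when the answers agree with the generated key. Every name here (`consistency_filter`, `answer_fn`, `n_samples`, `min_agreement`, the item fields) is hypothetical and not taken from the VietMed-MCQ work, and the stub verifier stands in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Dict, List

Item = Dict[str, str]


def consistency_filter(
    items: List[Item],
    answer_fn: Callable[[Item], str],
    n_samples: int = 3,
    min_agreement: float = 1.0,
) -> List[Item]:
    """Keep items whose generated key is reproduced by repeated answering.

    Assumption: an item passes when the most frequent sampled answer equals
    the stored key and its share of the samples reaches min_agreement.
    """
    kept = []
    for item in items:
        # Sample the verifier n_samples times and count each answer label.
        votes = Counter(answer_fn(item) for _ in range(n_samples))
        top_answer, top_count = votes.most_common(1)[0]
        if top_answer == item["answer"] and top_count / n_samples >= min_agreement:
            kept.append(item)
    return kept


def stub_answer_fn(item: Item) -> str:
    # Hypothetical stand-in for a model call: always answers "A".
    return "A"


# Placeholder items, not real dataset content.
items = [
    {"question": "placeholder question 1", "answer": "A"},
    {"question": "placeholder question 2", "answer": "B"},
]

kept = consistency_filter(items, stub_answer_fn)
kept_questions = [item["question"] for item in kept]
```

With the stub verifier, only the first placeholder item survives, because its stored key matches the answer the verifier gives every time. Lowering `min_agreement` would trade stricter filtering for a larger dataset when a real, non-deterministic model is used.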