Researchers have developed a framework for evaluating how well Multimodal Large Language Models (MLLMs) identify misinformation in Chinese short videos. The study used a dataset of 200 videos annotated for deceptive patterns such as experimental errors and logical fallacies. Gemini-2.5-Pro performed best, with a belief score of 71.5, while o3 performed poorly, with a score of 35.2. The evaluation also found that MLLMs are susceptible to biases, such as those introduced by authoritative channel IDs.
Impact: This research highlights MLLM vulnerabilities to misinformation and bias, suggesting a need for improved robustness in multimodal AI systems.
Ranking rationale: This is a research paper introducing a new evaluation framework and dataset for MLLMs.
AI-generated summary · Google Gemini · from 1 source.