visual question answering
PulseAugur coverage of visual question answering — every cluster mentioning visual question answering across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
New VQA benchmarks and methods address knowledge, adaptability, and relevance
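Researchers have introduced several new benchmarks and methods for visual question answering (VQA) systems. HyLoVQA proposes a low-rank adaptation technique generated by a dynamic hypernetwork for continual VQA, improving adaptability to new tasks and objects. WikiVQABench provides a knowledge-augmented VQA benchmark built on Wikipedia and Wikidata, designed to test models on questions that require external knowledge. In addition, UCSF-PDGM-VQA focuses on brain tumor MRI interpretation and highlights the limitations of current VLMs in clinical settings, RoboSurg-VQA addresses surgical segmentation-aware VQA, and VISTAQA…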
-
Researchers develop new methods for knowledge graph retrieval and completion
Researchers have developed new frameworks to enhance knowledge graph completion and visual question answering by integrating multimodal knowledge graphs with retrieval-augmented generation techniques. One approach, RADD…
-
HAC adapts CLIP to hyperbolic space for zero-shot VQA tasks
Researchers have introduced HAC, a novel framework that adapts pre-trained CLIP models to hyperbolic geometry for improved zero-shot Visual Question Answering (VQA). This parameter-efficient approach allows existing CLI…
-
New benchmarks SpecVQA and M3-VQA challenge multimodal LLMs in scientific and multi-hop reasoning
Researchers have introduced M$^3$-VQA, a new benchmark designed to evaluate multimodal large language models (MLLMs) on complex reasoning tasks involving multiple entities and multi-hop inference. The benchmark challeng…