LVLMs
PulseAugur coverage of LVLMs — every cluster mentioning LVLMs across labs, papers, and developer communities, ranked by signal.
-
Perceptual Flow Network and VGR enhance visual reasoning in LVLMs
Researchers have developed a Perceptual Flow Network (PFlowNet) to improve visual reasoning in Large Vision-Language Models (LVLMs). PFlowNet decouples perception from reasoning and uses variational reinforcement learni…
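The decoupling idea can be sketched in a few lines: a perception stage condenses the visual input into symbolic percepts, and a reasoning stage operates only on those percepts, never on the raw image. The function names and the dictionary interface below are illustrative assumptions, not PFlowNet's actual API.

```python
# Minimal sketch of perception/reasoning decoupling, in the spirit of
# PFlowNet. All names here are hypothetical stand-ins; a real LVLM would
# replace `perceive` with a vision encoder and `reason` with the LLM.

def perceive(image_tokens):
    """Perception stage: condense raw visual input into symbolic facts."""
    return {"objects": sorted(set(image_tokens))}

def reason(question, percepts):
    """Reasoning stage: sees only the percepts, never the raw image."""
    if question == "count":
        return len(percepts["objects"])
    return None

image = ["cat", "dog", "cat"]
print(reason("count", perceive(image)))  # 2 distinct objects
```

Because the reasoning stage can only consume the structured percepts, perception errors and reasoning errors stay separable, which is what makes the decoupled design attractive for reinforcement-style training signals.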
-
Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs
Researchers have introduced Persistent Visual Memory (PVM), a novel module designed to address the "Visual Signal Dilution" problem in Large Vision-Language Models (LVLMs). This issue causes visual attention to weaken a…
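"Visual Signal Dilution" can be made concrete by tracking what fraction of each decoding step's attention mass lands on the visual-token prefix. The attention rows below are synthetic, purely to illustrate the effect PVM is meant to counteract; in practice they would come from the decoder's attention maps.

```python
# Illustrative measurement of visual attention dilution: as generated
# text tokens accumulate, the share of attention on visual tokens drops.
# Synthetic data only -- not the PVM module itself.

def visual_attention_share(attn_row, n_visual):
    """Fraction of one step's attention mass on the visual-token prefix."""
    return sum(attn_row[:n_visual]) / sum(attn_row)

# Each row is one decoding step; the first 2 positions are visual tokens.
steps = [
    [0.4, 0.4, 0.2],            # early step: most mass on visual tokens
    [0.2, 0.2, 0.3, 0.3],       # later: text tokens start to dominate
    [0.1, 0.1, 0.2, 0.3, 0.3],  # dilution: visual share keeps shrinking
]
shares = [visual_attention_share(row, n_visual=2) for row in steps]
print(shares)  # monotonically decreasing visual share
```

A persistent-memory module would aim to keep this share (or an equivalent visual signal) from decaying toward zero during long generations.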
-
New methods enhance LVLMs for fine-grained visual recognition tasks
Two new research papers propose novel methods for improving Fine-Grained Visual Recognition (FGVR) using Large Vision-Language Models (LVLMs). The first paper introduces SARE, a framework that adaptively applies reasoni…
-
Aligning with Your Own Voice: Self-Corrected Preference Learning for Hallucination Mitigation in LVLMs
Researchers are developing new frameworks to address hallucinations in large language models (LLMs). One approach, termed "LLM Psychosis," categorizes severe reality-boundary failures and proposes a diagnostic scale to …
-
New benchmark reveals LVLMs hallucinate due to text priors, proposes fine-tuning fix
Researchers have developed a new benchmark called HalluScope to investigate hallucinations in large vision-language models (LVLMs). Their findings indicate that these models often generate outputs not grounded in visual…
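One simple way to probe the text-prior failure mode the summary describes is an ablation: ask the same question with the real image and with a blank substitute, and flag answers that do not change. The `toy_model` below is a hypothetical stand-in for an actual LVLM call, not part of HalluScope.

```python
# Sketch of a text-prior probe: if swapping the image for a blank leaves
# the answer unchanged, the answer likely came from language priors
# rather than the visual input. `toy_model` is an illustrative stub.

def toy_model(question, image):
    # Simulates a model leaning on text priors for color questions.
    if "color" in question:
        return "yellow"  # stereotypical answer regardless of the image
    return image.get("label", "unknown")

def relies_on_text_prior(question, image):
    blank = {"label": "blank"}
    return toy_model(question, image) == toy_model(question, blank)

print(relies_on_text_prior("what color is the banana?", {"label": "green banana"}))  # True
print(relies_on_text_prior("what is shown?", {"label": "a cat"}))  # False
```

A benchmark along these lines would aggregate such flags across many image-question pairs to quantify how often outputs are ungrounded, and a fine-tuning fix would then target the flagged cases.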