LLaVA-1.5
PulseAugur coverage of LLaVA-1.5: every cluster mentioning LLaVA-1.5 across labs, papers, and developer communities, ranked by signal.
-
New framework combats catastrophic forgetting in MLLMs
Researchers have introduced Curvature-Guided Mixing (CGM), a new framework designed to improve the adaptation of Multimodal Large Language Models (MLLMs). This method addresses the issue of catastrophic forgetting, wher…
-
New ALVTS method boosts LVLM efficiency with adaptive token selection
Researchers have introduced Adaptive Layer-wise Visual Token Selection (ALVTS), a new framework designed to improve the efficiency of Large Vision-Language Models (LVLMs). Unlike previous methods that permanently discar…
-
Vision-language model predicts coastlines as polylines
Researchers have developed CoastlineVLM-7B, a vision-language model designed to directly predict coastlines as polylines rather than segmentation masks. This approach, built on the GeoChat-7B/LLaVA-1.5 architecture, foc…
-
VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought
Researchers have introduced VG-CoT, a new dataset designed to improve the trustworthiness of Large Vision-Language Models (LVLMs). This dataset automatically links reasoning steps to specific visual evidence within imag…