LLaVA-1.5-7B
PulseAugur coverage of LLaVA-1.5-7B: every cluster mentioning the model across labs, papers, and developer communities, ranked by signal.
-
Vision-Language Models struggle with classroom engagement recognition
A new benchmark study evaluated five Vision-Language Models (VLMs) for their ability to recognize classroom engagement in zero-shot settings. The models, including GPT-4o and LLaVA-1.5-7B, performed poorly on individual…
-
New pruning techniques promise smaller models and faster training
Researchers have developed new methods for pruning neural networks and datasets to improve efficiency. DCP-Prune focuses on ultra-low token pruning for vision models, achieving high performance with significantly fewer …
-
New methods drastically cut VLM visual tokens, boosting efficiency
Researchers have developed three new methods to significantly compress the visual tokens used by large vision-language models (VLMs), aiming to reduce computational overhead and improve inference speed. InfoMerge uses t…
-
AI fine-tuned for bridge damage assessment and repair priority scoring
Researchers have developed a method to automate bridge damage assessment and repair priority scoring using fine-tuned Vision-Language Models (VLMs). By training LLaVA-1.5-7B with a curated dataset of bridge images and i…
-
Apple researchers balance image captioning with new RL framework
Apple researchers have developed BalCapRL, a new framework for reinforcement learning-based image captioning using multimodal large language models. This approach aims to balance multiple caption quality dimensions, inc…
-
New framework uses foundation models for car interior object detection
Researchers have developed a novel framework called ODAL for object detection and localization within car interiors, designed to overcome the computational limitations of in-vehicle systems. This framework splits proces…
-
Researchers analyze metric unreliability in multimodal machine unlearning
Researchers have identified significant unreliability in current evaluation metrics for machine unlearning in Vision-Language Models (VLMs). Analysis of 36 unlearned LLaVA-1.5-7B models revealed that standard metrics li…