Brief · PulseAugur

RESEARCH · Hugging Face Daily Papers English(EN) · 1d · [2 sources]

PRISMR: Overcoming Parse Collapse in Multimodal Listwise Ranking via Parameterized Representation Internalization

Researchers have developed PRISMR, a framework for improving how Large Multimodal Models (LMMs) perform listwise ranking, particularly in long-context scenarios. PRISMR targets a failure mode the authors call 'parse collapse,' in which an LMM omits candidates or ends its ranking prematurely, leaving an output that cannot be parsed into a complete ordering. The framework uses a hypernetwork to generate item-specific LoRA weights, which gives the model more robust structural conditioning without altering the base LMM. The authors report that this approach significantly reduces parse collapse and improves ranking accuracy on a new multimodal review-ranking benchmark.
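To make the 'parse collapse' failure mode concrete, here is a minimal sketch of how a ranking output might be checked for omitted candidates or early termination. It is not taken from the paper: the bracketed output format (`[3] > [1] > [2]`), the function name `check_ranking_output`, and the returned field names are all assumptions chosen for illustration.

```python
import re


def check_ranking_output(output: str, num_candidates: int) -> dict:
    """Check a listwise ranking string for signs of parse collapse.

    Assumes (hypothetically) that the model ranks candidates numbered
    1..num_candidates and writes them as '[3] > [1] > [2]'.
    """
    # Pull every bracketed integer out of the model output, in order.
    ids = [int(m) for m in re.findall(r"\[(\d+)\]", output)]
    expected = set(range(1, num_candidates + 1))

    ranking = []        # valid identifiers, first occurrence only
    duplicates = []     # identifiers the model repeated
    out_of_range = []   # identifiers that match no candidate
    seen = set()
    for i in ids:
        if i not in expected:
            out_of_range.append(i)
        elif i in seen:
            duplicates.append(i)
        else:
            seen.add(i)
            ranking.append(i)

    # Candidates that never appear: omitted or cut off by early termination.
    missing = sorted(expected - seen)

    return {
        "ranking": ranking,
        "missing": missing,
        "duplicates": duplicates,
        "out_of_range": out_of_range,
        "valid": not missing and not duplicates and not out_of_range,
    }


if __name__ == "__main__":
    # Five candidates were supplied, but the output stops after three.
    print(check_ranking_output("[2] > [4] > [1]", num_candidates=5))
```

Under these assumptions, an output counts as collapsed whenever `missing` is non-empty, so the fraction of outputs with `valid == False` would give a rough collapse rate over a test set.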

IMPACT Introduces a method that helps LMMs return complete rankings in long-context multimodal tasks, which could benefit applications that must order many candidates at once.

Hugging Face
Large Multimodal Models
arXiv
LoRA
Parameterized Representation Internalization for Semantic Multimodal Ranking