Brief

last 24h

[3/3] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.AI English(EN) · 9h

MODE: Modality-Decomposed Expert-Level Mixed-Precision Quantization for MoE Multimodal LLMs

Researchers have introduced MODE, a novel quantization framework designed to reduce the significant memory costs associated with Mixture-of-Experts Multimodal Large Language Models (MoE-MLLMs). The framework addresses biases in expert importance estimation that hinder performance in existing methods. By decomposing expert selection frequency by modality and filtering redundant vision tokens, MODE aims to improve the accuracy of quantization, especially for text-critical experts. Experiments demonstrate that MODE achieves substantial compression, with minimal performance loss even at extreme bit-width settings.
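To make the idea of decomposing expert selection frequency by modality concrete, here is a minimal Python sketch. It is an illustration only and not the paper's algorithm: the function names, the modality weights, and the "top-ranked experts get more bits" rule are all assumptions made for this example, and the vision-token filtering step mentioned above is not modeled.

```python
from collections import Counter

def modality_frequencies(routing, num_experts):
    """Per-modality selection frequency for each expert.

    routing: iterable of (modality, expert_id) pairs, one per routed token.
    Frequencies are normalized within each modality, so a modality with
    many tokens (e.g. vision) cannot drown out one with few (e.g. text).
    """
    counts = {}
    for modality, expert in routing:
        counts.setdefault(modality, Counter())[expert] += 1
    freqs = {}
    for modality, counter in counts.items():
        total = sum(counter.values())
        freqs[modality] = [counter[e] / total for e in range(num_experts)]
    return freqs

def assign_bits(freqs, weights, high_bits=4, low_bits=2, num_high=1):
    """Hypothetical allocation rule: the num_high experts with the largest
    weighted importance keep high_bits; all others get low_bits."""
    num_experts = len(next(iter(freqs.values())))
    score = [sum(weights[m] * freqs[m][e] for m in freqs)
             for e in range(num_experts)]
    ranked = sorted(range(num_experts), key=lambda e: score[e], reverse=True)
    high = set(ranked[:num_high])
    return [high_bits if e in high else low_bits for e in range(num_experts)]

# Toy routing trace: vision tokens dominate the count, text tokens are few.
routing = [("vision", 0)] * 6 + [("vision", 2)] * 2 + [("text", 1)] * 2
freqs = modality_frequencies(routing, num_experts=3)
bits = assign_bits(freqs, weights={"text": 0.7, "vision": 0.3})
print(bits)  # -> [2, 4, 2]
```

With pooled counts, expert 0 would look most important simply because vision tokens are more numerous. Normalizing per modality lets the text-heavy expert 1 keep the higher precision in this toy case.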

IMPACT Reduces memory footprint for MoE-MLLMs, potentially enabling wider deployment and experimentation with these powerful models.

RESEARCH · arXiv cs.CV English(EN) · 1w · [2 sources]

Mind the Gap: Disentangling Performance Bottlenecks in Video Instance Segmentation

Researchers have developed a new diagnostic framework to analyze performance bottlenecks in video instance segmentation (VIS). This framework uses an Integer Linear Program (ILP) to isolate error sources from classification, segmentation, and tracking objectives. The analysis revealed that tracking instability is a major issue for online VIS methods, especially in longer videos or denser scenes, and that stronger backbones do not significantly improve tracking performance.

IMPACT Provides a systematic foundation for improving robust long-term temporal association in video instance segmentation.

TOOL · arXiv cs.LG English(EN) · 2w

Solving Integer Linear Programming with Parallel Tempering

Researchers have developed a novel solver-free framework for tackling Integer Linear Programming (ILP) problems, which are common in combinatorial optimization. This new method directly explores feasible regions without relying on traditional solvers or machine learning training. It utilizes a Locally-Balanced Proposal for its transition kernel and incorporates Parallel Tempering, including a new penalty tempering technique that adjusts constraint barriers. The framework demonstrates superior performance compared to established solvers like SCIP and Gurobi on several benchmarks, showing greater robustness to distribution shifts than learning-based approaches.
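For readers unfamiliar with parallel tempering, the following Python sketch applies the textbook version to a tiny binary ILP (maximize c·x subject to Ax ≤ b). It is a simplified illustration under stated assumptions, not the paper's method: it uses uniform single-bit-flip proposals rather than a Locally-Balanced Proposal, a fixed penalty weight rather than penalty tempering, and the temperature ladder, penalty value, and toy instance are all made up for this example.

```python
import math
import random

def row_sums(x, A):
    """Left-hand side of each constraint row for assignment x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def is_feasible(x, A, b):
    return all(lhs <= bi for lhs, bi in zip(row_sums(x, A), b))

def energy(x, c, A, b, penalty):
    """Negative objective plus a fixed penalty per unit of violation."""
    objective = sum(ci * xi for ci, xi in zip(c, x))
    violation = sum(max(0, lhs - bi) for lhs, bi in zip(row_sums(x, A), b))
    return -objective + penalty * violation

def parallel_tempering(c, A, b, temps=(0.2, 0.5, 1.0, 2.0),
                       penalty=10.0, sweeps=2000, seed=0):
    rng = random.Random(seed)
    n = len(c)
    replicas = [[0] * n for _ in temps]  # one chain per temperature
    energies = [energy(x, c, A, b, penalty) for x in replicas]
    best_x, best_val = None, None
    for _ in range(sweeps):
        # Metropolis step in every replica: flip one uniformly chosen bit.
        for k, temp in enumerate(temps):
            x = replicas[k]
            i = rng.randrange(n)
            x[i] ^= 1
            e_new = energy(x, c, A, b, penalty)
            delta = e_new - energies[k]
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                energies[k] = e_new
                if is_feasible(x, A, b):
                    val = sum(ci * xi for ci, xi in zip(c, x))
                    if best_val is None or val > best_val:
                        best_x, best_val = x[:], val
            else:
                x[i] ^= 1  # reject: undo the flip
        # Replica exchange between neighbouring temperatures.
        for k in range(len(temps) - 1):
            d = (1 / temps[k] - 1 / temps[k + 1]) * (energies[k] - energies[k + 1])
            if d >= 0 or rng.random() < math.exp(d):
                replicas[k], replicas[k + 1] = replicas[k + 1], replicas[k]
                energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return best_x, best_val

# Toy knapsack-style instance (hypothetical data).
c = [10, 13, 7, 8]
A = [[3, 4, 2, 3]]
b = [7]
best_x, best_val = parallel_tempering(c, A, b)
print(best_x, best_val)
```

The hot replicas accept many worsening moves and so cross infeasible or low-value regions easily, while the cold replicas refine good assignments. The exchange step lets promising states migrate toward the cold end of the ladder.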
