MatchLM2Lite: A Scalable MLLM-to-Lite Framework for Reproduced Content Identification
Researchers have developed MatchLM2Lite, a framework for efficiently identifying reproduced video content. The system uses a distilled multimodal large language model (MLLM) to achieve low-latency, high-throughput inference. The framework comprises two modules, MatchLM and MatchLite, and has demonstrated a significant improvement in F1-score over previous models while drastically reducing computational cost. Its deployment lowered the rate of reproduced video views on a platform by 2.5% without negatively impacting user engagement.