RelayFormer framework unifies image and video manipulation localization

By PulseAugur Editorial · [1 source] · 2026-06-11 04:00

Researchers have introduced RelayFormer, a framework for localizing manipulated regions in both images and videos. The unified approach targets two limitations of existing methods: difficulty handling diverse resolutions, and the separate treatment of image and video data. RelayFormer uses Global Local Relay (GLR) tokens and a relay-based attention mechanism to exchange contextual information efficiently while preserving fine-grained manipulation artifacts.
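As a rough illustration of how relay-style attention can pass context between local regions, the sketch below runs attention inside each local unit and then lets only a few relay tokens per unit attend to each other. It is not the paper's implementation: the function names (`attention`, `relay_block`), the tensor shapes, the two-step ordering, and the use of identity projections in place of learned weights are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain scaled dot-product attention, batched over the leading axis.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def relay_block(patch_tokens, relay_tokens):
    """Hypothetical relay-style block (illustrative only).

    patch_tokens: (units, n, d) tokens of each local unit, e.g. an image
                  tile or a video-frame tile.
    relay_tokens: (units, r, d) a few relay tokens attached to each unit.
    """
    units, r, d = relay_tokens.shape

    # Step 1 (local): each unit attends only within itself, over its own
    # relay tokens plus its own patch tokens. Fine detail stays local.
    local = np.concatenate([relay_tokens, patch_tokens], axis=1)
    local = local + attention(local, local, local)
    relay, patches = local[:, :r], local[:, r:]

    # Step 2 (global): only the relay tokens of all units attend to each
    # other, so cross-unit context travels through units * r tokens.
    flat = relay.reshape(1, units * r, d)
    flat = flat + attention(flat, flat, flat)
    relay = flat.reshape(units, r, d)

    return patches, relay

rng = np.random.default_rng(0)
patches = rng.normal(size=(4, 16, 8))  # 4 units, 16 tokens each, dim 8
relays = rng.normal(size=(4, 2, 8))    # 2 relay tokens per unit
new_patches, new_relays = relay_block(patches, relays)
```

In this sketch the global step runs over `units * r` tokens rather than every patch token, which is where the efficiency would come from. A unit's patch tokens only see other units' context in a subsequent block, once the updated relay tokens are fed back into the local step.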

IMPACT Introduces a unified approach for visual manipulation localization, potentially improving efficiency and accuracy in detecting altered media.

RANK_REASON This is a research paper describing a new technical framework.



AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Wen Huang, Jiarui Yang, Tao Dai, Jiawei Li, Shaoxiong Zhan, Bin Wang, Shu-Tao Xia · 2026-06-11 04:00

RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization

arXiv:2508.09459v3. Abstract: Visual manipulation localization (VML) aims to identify tampered regions in images and videos, a task that has become increasingly challenging with the rise of advanced editing tools. Existing methods face two central issu…

