Vision-Encoder Behavioral Fingerprints of Image-to-Image Generative Models: A Training-Paradigm-Driven Taxonomy of Six Commercial APIs
A new research paper introduces a method for classifying image-to-image generative models by training paradigm. Analyzing the behavioral fingerprints of six commercial APIs, including GPT-image-1, Gemini 2.5 Flash Image, and SDXL img2img, the study found that models trained with an edit-based approach cluster separately from text-to-image base models that are adapted to image-to-image use at sampling time. The fingerprints were obtained by running a content-adaptive adversarial perturbation pipeline and scoring the outputs against clean references with a token distance computed from a frozen DINOv2 ViT-B/14 encoder.
AI IMPACT This research offers a way to categorize image-to-image generative models by how they were trained, which could aid in their evaluation and development.
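To make the scoring step concrete, here is a minimal sketch of a token-level distance between a clean reference and a model output. It assumes the distance is the mean cosine distance between corresponding patch tokens; the summary does not give the paper's exact definition, so treat that choice, and the helper names `token_distance` and `fingerprint`, as illustrative assumptions. Random arrays stand in for the encoder's patch tokens (a ViT-B/14 at 224x224 input yields 256 patch tokens of width 768).

```python
import numpy as np


def token_distance(ref_tokens, out_tokens, eps=1e-8):
    """Mean cosine distance between aligned token sets of shape (N, D).

    Assumption: cosine distance per token, averaged over tokens. The
    paper's actual token distance may be defined differently.
    """
    ref = np.asarray(ref_tokens, dtype=np.float64)
    out = np.asarray(out_tokens, dtype=np.float64)
    if ref.ndim != 2 or ref.shape != out.shape:
        raise ValueError("expected two arrays of identical shape (N, D)")
    # L2-normalize each token so the row-wise dot product is a cosine.
    ref_n = ref / (np.linalg.norm(ref, axis=1, keepdims=True) + eps)
    out_n = out / (np.linalg.norm(out, axis=1, keepdims=True) + eps)
    cos = np.sum(ref_n * out_n, axis=1)
    return float(np.mean(1.0 - cos))


def fingerprint(ref_tokens, outputs):
    """Hypothetical fingerprint: one distance per perturbed output."""
    return np.array([token_distance(ref_tokens, t) for t in outputs])


# Stand-in data: in a real pipeline these tokens would come from the
# frozen encoder applied to the clean reference and to each API output.
rng = np.random.default_rng(0)
ref = rng.normal(size=(256, 768))
outputs = [ref + s * rng.normal(size=ref.shape) for s in (0.1, 0.5, 2.0)]
fp = fingerprint(ref, outputs)
```

Vectors like `fp`, collected per API, are the kind of input a clustering step could then group by training paradigm; that grouping is not shown here.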