Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling
Researchers have introduced Scone, a method for subject-driven image generation that addresses a limitation of existing approaches: difficulty distinguishing between multiple subjects. Scone integrates composition and distinction capabilities, using an understanding expert to guide a generation expert in preserving subject identity. The method employs a two-stage training process and introduces a new benchmark, SconeEval, to assess both composition and distinction.
AI IMPACT: Enhances the ability of AI models to accurately generate images with multiple distinct subjects, improving realism and control.