Fusion Embedding for Pose-Guided Person Image Synthesis with Diffusion Model
Researchers have developed FPDM (Fusion Embedding for PGPIS with Diffusion Model), a framework for pose-guided person image synthesis (PGPIS): generating an image of a person from a source image in a specified target pose. The method uses contrastive learning to explicitly align fused source-pose embeddings with target image embeddings, then uses the learned fusion embedding as a conditioning signal for generation. An Image-Pose Fusion module learns these aligned embeddings, and a conditional diffusion model is guided by three inputs: the source appearance, the target pose, and the fusion embedding. Experiments on benchmark datasets show that FPDM improves texture fidelity and keeps results consistent across variations in pose and source image.
AI IMPACT: Improves fidelity and consistency in AI-generated human images for applications such as virtual try-on and digital avatars.
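The contrastive alignment step described above can be illustrated with a small sketch. The summary does not state which loss FPDM uses, so this assumes a symmetric InfoNCE-style objective over a batch, where the fused source-pose embedding at index i should match the target image embedding at index i. The function names, the temperature value of 0.07, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_alignment_loss(fused, target, temperature=0.07):
    """Symmetric InfoNCE-style loss (an assumed form, not taken from the paper).

    fused:  list of fused source-pose embeddings, one per sample
    target: list of target image embeddings, paired by index with `fused`
    """
    f = [normalize(v) for v in fused]
    t = [normalize(v) for v in target]
    # Pairwise cosine similarities scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(fi, tj)) / temperature for tj in t]
        for fi in f
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the fused-to-target and target-to-fused directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy data: 8 samples with 32-dimensional embeddings.
rng = random.Random(0)
target = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(8)]
# "Aligned" fused embeddings sit close to their paired targets.
aligned = [[x + 0.01 * rng.gauss(0, 1) for x in row] for row in target]
# Reversing the batch order breaks every pairing.
shuffled = aligned[::-1]

loss_aligned = contrastive_alignment_loss(aligned, target)
loss_shuffled = contrastive_alignment_loss(shuffled, target)
print(f"aligned pairs:  {loss_aligned:.4f}")
print(f"shuffled pairs: {loss_shuffled:.4f}")
```

Under these assumptions the loss is low when each fused embedding lies near its own target embedding and high when the pairing is broken, which is the property that would let a fusion embedding stand in for the unseen target image as a conditioning signal.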