Where Should Knowledge Enter? A Layered Framework for Knowledge Infusion in Multimodal Iterative Generative Mo
Researchers have proposed a new framework for integrating knowledge into multimodal generative models, addressing their unreliability with structured data. The framework categorizes knowledge infusion into four distinct layers: surface, trajectory, latent, and parametric. Experiments with diffusion models demonstrated that combining these layers significantly reduces knowledge-violating outputs, achieving a 70.97% improvement over standard generation.
AI IMPACT: This framework offers a structured approach to enhance the reliability and safety of multimodal generative models by better integrating domain-specific knowledge.