Researchers have developed a new framework called VLM-IMI that adapts large vision-language models for generative low-light image enhancement. The approach uses both normal-light images and textual descriptions to guide restoration, aiming for semantically informed and precise illumination improvements. The system conditions a diffusion model on these instruction priors and includes a fusion module that aligns image and text features. Notably, VLM-IMI supports iterative refinement of instructions during inference and allows direct manual control by users.
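The summary mentions a fusion module that aligns image and text features before conditioning the diffusion model. A common way to implement such alignment is cross-attention, where image tokens query the text-instruction tokens. The sketch below illustrates that pattern only; all names, shapes, and the random stand-in weights are assumptions, not VLM-IMI's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fusion(image_feats, text_feats, d_k=64):
    """image_feats: (N_img, d), text_feats: (N_txt, d).
    Each image token attends over the instruction tokens, so spatial
    locations can pick up semantic illumination cues from the text."""
    rng = np.random.default_rng(0)
    d = image_feats.shape[1]
    # Random projections stand in for learned Q/K/V weight matrices.
    Wq = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Q, K, V = image_feats @ Wq, text_feats @ Wk, text_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (N_img, N_txt) attention map
    return attn @ V                         # fused features, (N_img, d_k)

img = np.random.default_rng(1).standard_normal((16, 128))  # 16 image tokens
txt = np.random.default_rng(2).standard_normal((8, 128))   # 8 text tokens
fused = cross_attention_fusion(img, txt)
print(fused.shape)  # (16, 64)
```

In a diffusion pipeline, the fused features would typically be injected into the denoiser's intermediate layers as a conditioning signal, which is the role the instruction priors play in the summary above.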
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Introduces a novel method for generative low-light image enhancement by leveraging vision-language models and user instructions.
RANK_REASON This is a research paper detailing a new framework for image enhancement using vision-language models.