Synthetic Homes: A Multimodal Generative AI Pipeline for Residential Building Data Generation under Data Scarcity
Researchers have developed a multimodal generative AI pipeline called Synthetic Homes to create realistic residential building datasets. This framework addresses data scarcity in building energy modeling by integrating image, tabular, and simulation components. The system generates synthetic data from public records and images, demonstrating over 95% overlap with national datasets for key variables and outperforming GPT-based models in visual processing for building data.
AI IMPACT: Enables scalable downstream tasks like energy modeling and urban simulation by reducing reliance on costly or restricted data sources.