RAIGen: Rare Attribute Identification in Text-to-Image Generative Models
Researchers have developed RAIGen, a framework for identifying underrepresented attributes in text-to-image diffusion models. Unlike previous methods that focus on predefined fairness categories or general bias, RAIGen discovers rare or minority features without requiring prior knowledge of them. It uses Matryoshka Sparse Autoencoders and a novel metric that combines neuron activation frequency with semantic distinctiveness to pinpoint these attributes. Experiments show that RAIGen uncovers attributes beyond standard fairness categories in models such as Stable Diffusion and SDXL, and that it can also amplify these rare attributes during image generation.
AI IMPACT: Enables more comprehensive auditing and targeted generation of diverse imagery by identifying and amplifying underrepresented attributes.
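The idea of scoring sparse-autoencoder neurons by how rarely they fire and how semantically distinct they are can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not RAIGen's actual formula: the function name `rarity_scores`, the product `(1 - frequency) * distinctiveness`, the use of a frequency-weighted centroid as the "dominant" concept, and the small hand-made inputs. It also does not implement a Matryoshka Sparse Autoencoder; it only assumes that per-sample neuron activations and one direction vector per neuron are already available.

```python
import numpy as np

def rarity_scores(activations, directions, threshold=0.0):
    """Toy rarity score (hypothetical, not the paper's metric).

    activations: (n_samples, n_neurons) sparse-autoencoder activations.
    directions:  (n_neurons, dim) one direction vector per neuron.
    """
    # Activation frequency: fraction of samples on which each neuron fires.
    freq = (activations > threshold).mean(axis=0)

    # Unit-normalise each neuron's direction (guarding against zero norm).
    norms = np.linalg.norm(directions, axis=1, keepdims=True)
    unit = directions / np.clip(norms, 1e-12, None)

    # Assumption: the frequency-weighted mean direction stands in for
    # the "dominant" concept that common neurons encode.
    centroid = (freq[:, None] * unit).sum(axis=0)
    centroid = centroid / max(np.linalg.norm(centroid), 1e-12)

    # Semantic distinctiveness: 1 - cosine similarity to that centroid.
    distinct = 1.0 - unit @ centroid

    # Rare and distinctive neurons score high; never-firing neurons get 0.
    score = np.where(freq > 0, (1.0 - freq) * distinct, 0.0)
    return freq, distinct, score

# Hand-made example: neuron 0 fires always, neuron 1 rarely, neuron 2 never.
activations = np.array([[1.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [1.0, 1.0, 0.0],
                        [1.0, 0.0, 0.0]])
directions = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 0.0]])
freq, distinct, score = rarity_scores(activations, directions)
top_neuron = int(np.argmax(score))
```

In this example the rarely firing neuron whose direction differs from the dominant one ranks highest, while the always-on neuron and the dead neuron both score zero. A real audit would replace the hand-made arrays with activations collected from the diffusion model and would need some way to interpret each high-scoring neuron.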