VFUSE: Virulent Feature Understanding With Sparse AutoEncoders
Researchers have developed VFUSE, an approach that uses Sparse Autoencoders (SAEs) to interpret generative protein models such as RoseTTAFold3 and RFDiffusion3. The method aims to identify and understand features associated with hazardous protein designs, with the goal of improving safety in protein engineering. By training SAEs on diffusion-transformer activations, VFUSE demonstrated improved interpretability and better detection of virulence-associated features without compromising model performance.
AI IMPACT: Enhances safety in AI-driven protein design by providing tools to audit models for hazardous outputs.
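
For readers unfamiliar with the technique, the following is a minimal sketch of what a sparse autoencoder over model activations generally looks like. It is not VFUSE's implementation: the function names (`init_sae`, `sae_forward`, `sae_loss`), the dimensions, the L1 coefficient, and the random stand-in activations are all assumptions made for illustration, and no protein model is involved.

```python
import numpy as np

def init_sae(d_model, d_hidden, seed=0):
    """Create SAE parameters (hypothetical layout, not taken from VFUSE)."""
    rng = np.random.default_rng(seed)
    W_enc = rng.normal(0.0, 0.1, size=(d_model, d_hidden))
    return {
        "W_enc": W_enc,
        "b_enc": np.zeros(d_hidden),
        "W_dec": W_enc.T.copy(),  # decoder initialised as the encoder transpose
        "b_dec": np.zeros(d_model),
    }

def sae_forward(params, x):
    """Encode activations into non-negative features, then reconstruct them."""
    pre = (x - params["b_dec"]) @ params["W_enc"] + params["b_enc"]
    features = np.maximum(0.0, pre)  # ReLU keeps feature activations non-negative
    recon = features @ params["W_dec"] + params["b_dec"]
    return features, recon

def sae_loss(params, x, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparse features."""
    features, recon = sae_forward(params, x)
    mse = np.mean(np.sum((x - recon) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(features), axis=1))
    return mse + l1_coeff * sparsity

# Stand-in for a batch of transformer activations (assumed shape: batch x d_model).
acts = np.random.default_rng(1).normal(size=(8, 16))
params = init_sae(d_model=16, d_hidden=64)
features, recon = sae_forward(params, acts)
loss = sae_loss(params, acts)
```

In this kind of setup, the hidden layer is wider than the activation dimension, and the sparsity penalty pushes each input to activate only a few features, which is what makes individual features candidates for human inspection.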