Stability AI has released Stable Audio 3, a new family of latent diffusion models for generating and editing audio. The models produce variable-length stereo audio at 44.1 kHz and support inpainting-based editing, with an emphasis on fast inference. The release includes open weights for the smaller model scales and a technical paper detailing the architecture, which combines a novel semantic-acoustic autoencoder with a diffusion transformer.
AI IMPACT Accelerates AI-driven audio production and editing with open-weight models.
RANK_REASON Frontier-lab model release with open weights and technical paper.
AI-generated summary · Google Gemini · from 1 source.