Sony AI has released Woosh, a new foundation model for generating sound effects. The model includes components for audio encoding/decoding, text-audio alignment, and text-to-audio and video-to-audio generation. Distilled versions are also available for faster inference and lower resource usage. Evaluations indicate Woosh performs competitively with existing open-source models such as StableAudio-Open and TangoFlux.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Provides a new open-source tool for generative audio research and application development.
RANK_REASON Release of an open-source foundation model for sound effects generation with accompanying paper and code.