Sony AI releases Woosh, a sound effects foundation model competitive with open alternatives

By PulseAugur Editorial · [1 source] · 2026-04-30 04:00

Sony AI has released Woosh, a new foundation model for generating sound effects. The release includes components for audio encoding and decoding, text-audio alignment, text-to-audio generation, and video-to-audio generation. Distilled versions are also available for faster inference and lower resource usage. Evaluations indicate that Woosh performs competitively with existing open-source models such as StableAudio-Open and TangoFlux.

IMPACT Provides a new open-source tool for generative audio research and application development.

RANK_REASON Release of an open-source foundation model for sound effects generation with accompanying paper and code.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Gaëtan Hadjeres, Marc Ferras, Khaled Koutini, Benno Weck, Alexandre Bittar, Thomas Hummel, Zineb Lahrichi, Hakim Missoum, Joan Serrà, Yuki Mitsufuji · 2026-04-30 04:00

Woosh: A Sound Effects Foundation Model

arXiv:2604.01929v3 (announce type: replace-cross). Abstract: The audio research community depends on open generative models as foundational tools for building novel approaches and establishing baselines. In this report, we present Woosh, Sony AI's publicly released sound effect foun…
