PulseAugur
research · [1 source] ·

Sony AI releases Woosh, a sound effects foundation model competitive with open alternatives

Sony AI has released Woosh, a foundation model for generating sound effects. The release includes components for audio encoding and decoding, text-audio alignment, text-to-audio generation, and video-to-audio generation. Distilled versions are also available for faster inference and lower resource usage. Evaluations indicate that Woosh performs competitively with existing open-source models such as StableAudio-Open and TangoFlux.

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Provides a new open-source tool for generative audio research and application development.

RANK_REASON Release of an open-source foundation model for sound effects generation with accompanying paper and code.


COVERAGE [1]

  1. arXiv cs.AI TIER_1 · Gaëtan Hadjeres, Marc Ferras, Khaled Koutini, Benno Weck, Alexandre Bittar, Thomas Hummel, Zineb Lahrichi, Hakim Missoum, Joan Serrà, Yuki Mitsufuji

    Woosh: A Sound Effects Foundation Model

    arXiv:2604.01929v3. Abstract: The audio research community depends on open generative models as foundational tools for building novel approaches and establishing baselines. In this report, we present Woosh, Sony AI's publicly released sound effect foun…