Time-Varying Audio Effect Modeling by End-to-End Adversarial Training
Researchers propose a Generative Adversarial Network (GAN) framework for modeling time-varying audio effects without extracting control signals. The approach trains on input-output audio recordings alone, addressing a limitation of conventional black-box modeling for dynamic systems. The framework uses a convolutional-recurrent architecture and a two-stage training strategy: an initial adversarial phase learns the modulation behavior, followed by supervised fine-tuning with a State Prediction Network (SPN) for synchronization. The authors also introduce a metric for quantifying modulation accuracy, and experiments on a vintage phaser demonstrate the method's effectiveness.
AI IMPACT: Introduces a GAN-based approach for modeling complex, time-varying audio effects, potentially improving audio processing and synthesis tools.
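
To illustrate why synchronization matters for a time-varying effect such as a phaser, here is a minimal sketch of a toy digital phaser. It is not the model or the device from the paper. The function name `toy_phaser`, all parameter values, and the all-pass cascade design are assumptions chosen for illustration. The sketch shows that the same input produces different outputs when the hidden low-frequency oscillator (LFO) starts at a different phase, so a sample-wise loss cannot be compared meaningfully until the model's modulation is aligned with the recording.

```python
import math


def toy_phaser(x, sample_rate=8000, lfo_hz=5.0, lfo_phase=0.0,
               stages=4, depth=0.7, mix=0.5):
    """Toy phaser (illustrative assumption, not the paper's system).

    A cascade of first-order all-pass filters whose shared coefficient
    is swept by a sinusoidal LFO, mixed with the dry signal.
    """
    prev_in = [0.0] * stages    # x[n-1] for each all-pass stage
    prev_out = [0.0] * stages   # y[n-1] for each all-pass stage
    out = []
    for n, dry in enumerate(x):
        # Hidden control signal: it does not appear in the audio input.
        lfo = math.sin(2.0 * math.pi * lfo_hz * n / sample_rate + lfo_phase)
        a = depth * lfo  # all-pass coefficient, kept inside (-1, 1)
        s = dry
        for k in range(stages):
            # First-order all-pass: y[n] = a*x[n] + x[n-1] - a*y[n-1]
            y = a * s + prev_in[k] - a * prev_out[k]
            prev_in[k] = s
            prev_out[k] = y
            s = y
        out.append((1.0 - mix) * dry + mix * s)
    return out


def mse(u, v):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(u, v)) / len(u)


# Same 440 Hz input, two different (unobserved) LFO starting phases.
sr = 8000
x = [math.sin(2.0 * math.pi * 440.0 * n / sr) for n in range(sr // 2)]
wet_a = toy_phaser(x, sample_rate=sr, lfo_phase=0.0)
wet_b = toy_phaser(x, sample_rate=sr, lfo_phase=math.pi)

print("MSE, same LFO phase:     ", mse(wet_a, toy_phaser(x, sample_rate=sr)))
print("MSE, different LFO phase:", mse(wet_a, wet_b))
```

In this toy setting, the error between two equally valid renderings is nonzero purely because of the LFO phase offset. That is one plausible reading of why the summary describes an adversarial phase first, which does not require sample alignment, and a synchronization step before supervised fine-tuning.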