Two new research papers explore generative-model techniques for speech enhancement. The first introduces Audio-visual Contrastive Alignment (AVCA), which improves diffusion-based speech enhancement by enforcing stronger audio-visual correlation; it reports gains in interference suppression and signal reconstruction, particularly at low signal-to-noise ratios. The second proposes a skip-free backbone for flow-matching speech enhancement, guided by Latent Representation Alignment (LRA) with a Descript Audio Codec, which aims to preserve clean speech representations and enable efficient few-step inference.
IMPACT: These papers advance generative-model techniques for speech enhancement, potentially improving audio quality in noisy environments and enabling more efficient real-time applications.
RANK_REASON: Two academic papers published on arXiv detailing new methods for speech enhancement.
- alphaXiv
- arXiv
- CatalyzeX Code Finder for Papers
- CORE Recommender
- DagsHub
- Descript Audio Codec
- Gotit.pub
- Hugging Face
- Influence Flower
- Latent Representation Alignment
- Mostafa Sadeghi
- ScienceCast
- U-Net
- VoiceBank+DEMAND
- WSJ0-CHiME3
AI-generated summary · Google Gemini · from 3 sources.