Listen, Look, and Learn: Learning Without Forgetting through SAM-Audio
Researchers have developed a method for class-incremental learning (CIL) in audio-visual settings, where a model must acquire new classes without losing previously learned ones. The approach integrates the SAM-Audio multimodal model, using its audio features to guide visual representations through a novel attention strategy. To further combat catastrophic forgetting, the method adds dual-level distillation objectives at the feature and logit levels. It is reported to outperform existing state-of-the-art techniques on audio-visual CIL benchmarks.
AI IMPACT: Introduces a novel approach to audio-visual class-incremental learning, potentially improving continual learning capabilities in multimodal AI systems.
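
To make the idea of distilling at both the feature and the logit level more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not give the exact losses, so this assumes a mean-squared-error term between old-model and new-model features and a temperature-scaled KL divergence over the previously learned classes, as in generic knowledge distillation. All function names, the temperature, and the weights `lambda_feat` and `lambda_logit` are hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum for s in scaled]


def feature_distillation(old_feat, new_feat):
    """Assumed feature-level term: mean squared error between the frozen
    old model's features and the current model's features."""
    return sum((o - n) ** 2 for o, n in zip(old_feat, new_feat)) / len(old_feat)


def logit_distillation(old_logits, new_logits, temperature=2.0):
    """Assumed logit-level term: KL(old || new) over the old classes only.
    The new model may have extra outputs for new classes; those are ignored."""
    n_old = len(old_logits)
    log_p = log_softmax(old_logits, temperature)
    log_q = log_softmax(new_logits[:n_old], temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def dual_level_loss(task_loss, old_feat, new_feat, old_logits, new_logits,
                    lambda_feat=1.0, lambda_logit=1.0):
    """Task loss on the new classes plus both (hypothetical) distillation terms."""
    return (task_loss
            + lambda_feat * feature_distillation(old_feat, new_feat)
            + lambda_logit * logit_distillation(old_logits, new_logits))


if __name__ == "__main__":
    # Toy values: the new model has one extra output for a newly added class.
    loss = dual_level_loss(
        task_loss=0.7,
        old_feat=[0.2, 0.5, -0.1],
        new_feat=[0.25, 0.4, -0.1],
        old_logits=[2.0, 0.5],
        new_logits=[1.5, 0.7, 3.0],
    )
    print(f"total loss: {loss:.4f}")
```

Both distillation terms are zero when the new model reproduces the old model's features and old-class logits exactly, so the penalty grows only as the updated model drifts from what it previously learned.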