An Attention-based Model for Robust Forecasting with Missing Modality
Researchers have developed an attention-based multimodal model that handles missing sensor modalities during both training and inference. Formulated as a conditional variational autoencoder (CVAE) with a transformer backbone, the model learns a unified representation even when some modalities are incomplete. Experiments on five datasets spanning human trajectory prediction and robot manipulation forecasting show that it learns effectively from incomplete data and outperforms existing multimodal fusion methods.
AI IMPACT: This model could improve the robustness of AI systems in real-world robotic applications, where sensor data is often incomplete.
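
To make the idea of attention over incomplete modalities concrete, here is a minimal sketch of one common mechanism: masking missing modalities out of an attention-weighted fusion so they receive zero weight. This is not the paper's architecture (the summary gives no implementation details, and the CVAE and transformer components are omitted). The function name `fuse_modalities`, the single query vector, and the one-embedding-per-modality layout are all assumptions made for illustration.

```python
import numpy as np

def fuse_modalities(tokens, present, query):
    """Attention-pool modality embeddings while ignoring missing ones.

    Hypothetical helper for illustration only.
    tokens:  (M, D) array, one embedding per modality.
    present: (M,) boolean array, True where the modality was observed.
    query:   (D,) query vector (stands in for a learned parameter).
    Returns the fused (D,) vector and the (M,) attention weights.
    """
    tokens = np.asarray(tokens, dtype=float)
    present = np.asarray(present, dtype=bool)
    query = np.asarray(query, dtype=float)
    if not present.any():
        raise ValueError("at least one modality must be present")

    # Zero out rows of missing modalities so placeholder values
    # (even NaN) cannot leak into the result.
    tokens = np.where(present[:, None], tokens, 0.0)

    # Scaled dot-product scores, one per modality.
    scores = tokens @ query / np.sqrt(tokens.shape[1])

    # Missing modalities get a score of -inf, hence zero weight.
    scores = np.where(present, scores, -np.inf)

    # Numerically stable softmax over the observed modalities.
    scores = scores - scores[present].max()
    weights = np.exp(scores)
    weights = weights / weights.sum()

    return weights @ tokens, weights


# Usage: three modalities, the third one missing at this time step.
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [np.nan, np.nan]])
present = np.array([True, True, False])
fused, weights = fuse_modalities(tokens, present, query=np.array([1.0, 0.0]))
```

Because the mask is applied before the softmax, the weights of the observed modalities still sum to one, so the fused vector stays on the same scale regardless of how many modalities are missing.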