JAVEdit: Joint Audio-Visual Instruction-Guided Video Editing with Agentic Data Curation
Researchers have introduced JAVEdit-100k, a new large-scale dataset designed for instruction-guided joint audio-visual video editing. The dataset contains approximately 100,000 editing triplets across five categories, created using a novel generation pipeline with agent-in-the-loop quality control. To standardize evaluation, they also developed JAVEditBench, a comprehensive benchmark, and proposed JAVEdit, a baseline model that demonstrated superior performance on multiple metrics.
AI IMPACT: Enables more sophisticated AI-driven video editing by providing dedicated resources for audio-visual synchronization.