SketchSong: Hierarchical Song Generation with Sketch Planning and Fine-Grained Multi-Track Modeling
Researchers have developed two new frameworks, SegTune and SketchSong, to improve the control and structure of AI-generated music. SegTune uses a Diffusion Transformer to provide fine-grained control by aligning local descriptions with specific song segments, which improves musicality and controllability. SketchSong takes a hierarchical approach, combining sketch planning with multi-track modeling to address arrangement coherence and the distinct roles of musical parts; it outperforms baselines in both objective and human evaluations.
IMPACT: These frameworks offer more sophisticated control over AI music generation, potentially enabling new creative tools for musicians and producers.