A user on r/MachineLearning is asking for help with a slow imitation-learning training pipeline for robotics. The model is a Diffusion Transformer (DiT) with roughly 50 million parameters, trained on modern hardware including an NVIDIA A4500 GPU, yet throughput is only about 10 iterations per second, which pushes training runs to multiple days. The user reports high CPU utilization alongside low GPU utilization. Freezing the encoder and substituting synthetic data each yielded only minimal improvement.
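High CPU utilization combined with low GPU utilization is often, though not always, a sign that the training loop spends its time waiting on something other than the model's compute. One way to check is to time the two halves of each iteration separately. The following is a minimal sketch of that measurement, not the poster's code: `profile_loop`, `slow_loader`, and `fast_step` are hypothetical names introduced here, and the sleeps stand in for real data loading and a real training step.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator


def profile_loop(
    batches: Iterable[Any],
    step_fn: Callable[[Any], None],
    max_steps: int = 100,
) -> Dict[str, float]:
    """Split wall-clock time into 'waiting for the next batch' and 'running the step'.

    Assumption: step_fn blocks until its work is finished. With an
    asynchronous GPU framework, step_fn would need to synchronize the
    device before returning, otherwise step time is under-counted.
    """
    data_seconds = 0.0
    step_seconds = 0.0
    steps = 0
    iterator = iter(batches)
    while steps < max_steps:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here is data-pipeline time
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # time spent here is model/optimizer time
        t2 = time.perf_counter()
        data_seconds += t1 - t0
        step_seconds += t2 - t1
        steps += 1
    total = data_seconds + step_seconds
    return {
        "steps": float(steps),
        "data_seconds": data_seconds,
        "step_seconds": step_seconds,
        "data_fraction": data_seconds / total if total > 0 else 0.0,
    }


def slow_loader(n: int, delay: float) -> Iterator[int]:
    """Hypothetical stand-in for a CPU-bound data pipeline."""
    for i in range(n):
        time.sleep(delay)
        yield i


def fast_step(batch: int) -> None:
    """Hypothetical stand-in for a quick forward/backward pass."""
    time.sleep(0.001)


if __name__ == "__main__":
    report = profile_loop(slow_loader(20, 0.02), fast_step)
    print(report)
```

If `data_fraction` comes out close to 1, the loop is mostly waiting for batches, and changes to the model (such as freezing an encoder) would be expected to have little effect on throughput. If it stays high even with synthetic in-memory data, the overhead may lie elsewhere in the loop, such as logging or host-to-device transfers.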
AI-generated summary · Google Gemini · from 1 source.