A new diffusion transformer model, PixelDiT, has been released. It has 1.3 billion parameters and operates directly in pixel space, without a VAE. The model is designed to be efficient, requiring only 4GB of VRAM, and is fully compatible with the Hugging Face Diffusers library. It also supports the Qwen encoder.
AI IMPACT Provides a new, efficient diffusion model for image generation tasks.
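As a rough sanity check on the 1.3 billion parameter count and the 4GB VRAM figure, the sketch below estimates the memory needed for the model weights alone at a few common precisions. The summary does not say which precision the 4GB figure assumes, so the byte sizes are assumptions, and the helper name `weight_memory_gib` is hypothetical. The estimate ignores activations, the text encoder, and framework overhead.

```python
# Back-of-the-envelope estimate of weight memory for a 1.3B-parameter model.
# Assumption: only the weights are counted; activations, the text encoder,
# and framework overhead are ignored.

PARAMS = 1_300_000_000  # parameter count quoted in the summary


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed to store the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Bytes per parameter for common storage precisions (assumed, not from the source).
PRECISIONS = {"float32": 4, "float16/bfloat16": 2, "int8": 1}

if __name__ == "__main__":
    for name, nbytes in PRECISIONS.items():
        print(f"{name}: {weight_memory_gib(PARAMS, nbytes):.2f} GiB")
```

Under these assumptions, half-precision weights take roughly 2.4 GiB, which leaves some room within a 4GB budget, while full-precision weights alone would exceed it.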
RANK_REASON Release of a new open-source model with technical specifications.
AI-generated summary · Google Gemini · from 1 source.