A developer has released two small models: HobbyLM, a 500-million-parameter large language model, and a 330-million-parameter image generator. The LLM was pre-trained on 40 billion tokens from FineWeb and then post-trained to extend its context window. The image generator was inspired by ByteDance's Dreamlite architecture and trained on distilled datasets from Midjourney, Flux, and Google's CCW3. The developer used the Claude SDK to build an agentic harness that orchestrated the training process. The model weights and a playground are available on Hugging Face.
IMPACT This release makes a small LLM and a small image generator openly available, which could support further research and development on smaller-scale AI models.
RANK_REASON The item describes the pre-training and post-training of custom-built LLM and image generator models, which falls under research.
AI-generated summary · Google Gemini · from 1 source.