PulseAugur

Baidu releases ERNIE-Image, an open-source text-to-image model

Baidu has introduced ERNIE-Image, an open-source text-to-image generation model built on an 8B single-stream DiT (Diffusion Transformer) architecture. The model aims to close the gap with leading closed-source systems by improving the quality of its pre-training data and supervision. Its multi-stage data construction pipeline, which includes fine-grained categorization, detailed captioning, and aesthetic assessment, is meant to give the model a stronger foundation for complex generation tasks. Baidu also provides a lightweight Prompt Enhancer and an industrial-grade aesthetic model to support practical use and evaluation.

IMPACT This open-source release provides a strong foundation for text-to-image generation, potentially accelerating research and development in the AIGC community.

RANK_REASON The cluster contains a technical report detailing a new open-source model release.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.LG TIER_1 English (EN) · Jiaxiang Liu, Zhida Feng, Pengyu Zou, Zhenyu Qian, Tianrui Zhu, Jun Xia, Yuehu Dong, Yanzheng Lin, Honglin Xiong, Anqi Chen, Yunpeng Ding, Jinghui Duan, Lin Gao, Chao Han, Tiechao He, Jiakang Hu, Ranjun Hua, Xueming Jiang, Qingli Kong, Yuting Lei, Tianyu…

    ERNIE-Image Technical Report

    arXiv:2605.25347v1 Announce Type: cross Abstract: We introduce ERNIE-Image, an open-source text-to-image generation model built upon an 8B single-stream DiT architecture. ERNIE-Image aims to bridge the gap between current open-source models and leading closed-source systems throu…