PulseAugur

LLMs show zero-shot visual creativity scoring ability, research finds

A new research paper examines whether multimodal large language models (LLMs) can score visual creativity zero-shot, that is, without task-specific training. The study tested six LLMs, including Gemini 3 Flash, Gemma 4-31B-it, and GPT-5.4 Mini, on AI-generated images and human sketches. The models' scores aligned with human creativity ratings, with correlations ranging from .29 to .68. The models' step-by-step reasoning made their evaluation criteria interpretable, for example how they balanced originality against quality, but it did not improve their alignment with human judgments.
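
To make the alignment figures concrete, here is a minimal sketch of how a correlation between model scores and human ratings could be computed. The summary does not say which correlation coefficient the paper uses, so Pearson's r is an assumption here, and the ratings below are made-up illustrative values, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


# Hypothetical creativity ratings for five images (not from the paper).
human_ratings = [1.0, 2.0, 3.0, 4.0, 5.0]
model_ratings = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(human_ratings, model_ratings)
print(f"model-human correlation: r = {r:.2f}")
```

A value near 1 would mean the model ranks and spaces the images much as the human raters do; the reported range of .29 to .68 indicates weak-to-moderate up to fairly strong agreement.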

IMPACT Multimodal LLMs demonstrate potential for zero-shot visual creativity assessment, offering interpretable reasoning for AI-generated art and sketches.

RANK_REASON Academic paper detailing research findings on LLM capabilities.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · William Orwig, Roger E. Beaty ·

    How LLMs See Creativity: Zero-Shot Scoring of Visual Creativity with Interpretable Reasoning

    arXiv:2606.29672v1. Abstract: Evaluating the originality of visual images poses enduring challenges for creativity assessment. Automated scoring using AI models has proven effective in the verbal domain, yet key questions remain about evaluating visual creativit…

  2. arXiv cs.CL TIER_1 English(EN) · Roger E. Beaty ·

    How LLMs See Creativity: Zero-Shot Scoring of Visual Creativity with Interpretable Reasoning

    Evaluating the originality of visual images poses enduring challenges for creativity assessment. Automated scoring using AI models has proven effective in the verbal domain, yet key questions remain about evaluating visual creativity and understanding how models arrive at their r…