PulseAugur

New SCDBench benchmark reveals LLM struggles with smart contract decompilation

A new benchmark, SCDBench, evaluates large language models (LLMs) on smart contract decompilation. Its dataset comprises 600 real-world Solidity contracts, each paired with bytecode, ground-truth source code, and semantic checkpoints. Frontier LLMs such as Claude Opus 4.7 and GPT-5.3-Codex show promise in generating structured, compilable code but struggle with semantic consistency: the best model perfectly decompiled only 42 contracts. The research also found that incorporating compilation repair significantly improves performance.

IMPACT Highlights the limits of LLMs' ability to ensure semantic consistency in generated smart contract code, indicating a need for further research in this area.

RANK_REASON The cluster contains an academic paper introducing a new benchmark for evaluating LLM capabilities in a specific domain.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Kaihua Qin, Dawn Song, Arthur Gervais

    SCDBench: A Benchmark for LLM-Based Smart Contract Decompilers

    arXiv:2605.29059v1 Announce Type: cross Abstract: Smart contract decompilation aims to recover high-level source code from bytecode, but evaluating decompilers remains difficult because existing studies use narrow datasets, inconsistent metrics, and limited semantic consistency c…