PulseAugur

New VeriGround model achieves reliable circuit-to-Verilog code generation

Researchers have identified a reliability problem in multimodal large language models (MLLMs) that generate hardware description language (HDL) code from circuit diagrams. In this "Mirage" phenomenon, a model bypasses the visual input and instead uses textual identifiers to retrieve pre-existing code templates, so it can achieve high accuracy even when the image is blank. To address this, the authors developed VeriGround (4B), a model that incorporates techniques such as identifier anonymization and a new preference alignment method, D-ORPO, to ensure genuine visual grounding. VeriGround is reported to outperform baselines on anonymized inputs and to maintain a high refusal rate for invalid inputs.
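The summary does not say how VeriGround's identifier anonymization is implemented. As an illustration only, here is a minimal Python sketch that assumes anonymization means replacing descriptive names (such as `half_adder` or `carry`) with neutral placeholders, so that names no longer hint at which template to retrieve. The function name `anonymize_identifiers`, the keyword list, and the `id_N` naming scheme are all hypothetical and are not taken from the paper.

```python
import re

# Assumption: a small, incomplete set of Verilog keywords to leave untouched.
VERILOG_KEYWORDS = {
    "module", "endmodule", "input", "output", "inout", "wire", "reg",
    "assign", "always", "begin", "end", "if", "else", "case", "endcase",
    "default", "posedge", "negedge", "parameter",
}

# Match identifiers, but skip the base/digit part of literals such as 1'b0
# and system tasks such as $display (the lookbehind rejects ' and $).
IDENTIFIER = re.compile(r"(?<![\w'$])[A-Za-z_][A-Za-z0-9_$]*")


def anonymize_identifiers(source: str) -> tuple[str, dict[str, str]]:
    """Replace each user-defined name with a neutral placeholder (id_0, id_1, ...).

    Returns the rewritten text and the original-to-placeholder mapping.
    Comments and strings are not treated specially in this sketch.
    """
    mapping: dict[str, str] = {}

    def rename(match: re.Match) -> str:
        name = match.group(0)
        if name in VERILOG_KEYWORDS:
            return name
        if name not in mapping:
            mapping[name] = f"id_{len(mapping)}"
        return mapping[name]

    return IDENTIFIER.sub(rename, source), mapping


if __name__ == "__main__":
    src = (
        "module half_adder(input a, input b, output sum, output carry);\n"
        "  assign sum = a ^ b;\n"
        "  assign carry = a & b;\n"
        "endmodule\n"
    )
    anonymized, mapping = anonymize_identifiers(src)
    print(anonymized)
    print(mapping)
```

With names stripped this way, a model that still produces a correct half adder has to recover the structure from the diagram rather than from the label, which is the kind of check the "Mirage" finding motivates.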

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Highlights a new class of failure in MLLMs for code generation, necessitating more robust evaluation and training methods.

RANK_REASON Academic paper introducing a new phenomenon and a model to address it.


COVERAGE [2]

  1. arXiv cs.AI TIER_1 · Guang Yang, Xing Hu, Xiang Chen, Xin Xi ·

    From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation

    arXiv:2604.27969v1 (cross-listed). Abstract: Multimodal large language models (MLLMs) are increasingly used to translate visual artifacts into code, from UI mockups into HTML to scientific plots into Python scripts. A circuit diagram can be viewed as a visual domain-specific…

  2. arXiv cs.AI TIER_1 · Xin Xi ·

    From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation

    Multimodal large language models (MLLMs) are increasingly used to translate visual artifacts into code, from UI mockups into HTML to scientific plots into Python scripts. A circuit diagram can be viewed as a visual domain-specific language for hardware: it encodes timing, topolog…