Researchers have introduced Ishigaki-IDS-Bench, a benchmark that evaluates how well large language models (LLMs) generate Information Delivery Specification (IDS) XML from Building Information Modeling (BIM) requirements. The benchmark contains 166 expert-verified examples spanning multiple construction domains and languages, each paired with a gold IDS file for comparison. Initial evaluations show that LLMs can partially express information requirements but struggle to consistently produce XML that adheres to the IDS standard and to IFC vocabulary constraints; the best model reaches only 65.6% content agreement.
AI IMPACT This benchmark could help advance LLM capabilities in generating domain-specific, standardized structured data, which industries such as construction depend on.
RANK_REASON The cluster describes a new academic paper introducing a benchmark for evaluating LLM performance on a specific structured data generation task.
- Building Information Modeling
- GitHub
- Hugging Face
- IFC
- Information Delivery Specification
- Ishigaki-IDS-Bench
- Large language models
- XML
AI-generated summary · Google Gemini · from 2 sources.