PulseAugur

HPC trace merging framework expands hardware counter coverage for ML

Researchers have developed a heuristic-based method for merging High-Performance Computing (HPC) execution traces, with the goal of expanding the set of hardware counters available for machine learning-based performance prediction. Only a restricted set of hardware counters can be collected simultaneously, so the technique merges traces from multiple runs, each collecting different counters. It matches computation bursts across executions using MPI structure, timing, and communication patterns, producing a unified dataset with a richer feature space for training ML models without manual counter selection. Validation on the MareNostrum5 machine showed that the merged counters maintain acceptable accuracy across various applications and kernels.
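The burst-matching idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Burst` fields, the `merge_runs` function, the 10% duration tolerance, and the rule of matching on rank, position, and terminating MPI call are all assumptions chosen for illustration, and the counter names are simply example PAPI preset events.

```python
from dataclasses import dataclass, field


@dataclass
class Burst:
    """One computation burst between MPI calls (hypothetical record layout)."""
    rank: int            # MPI rank that executed the burst
    index: int           # position of the burst in that rank's sequence
    duration_us: float   # elapsed time of the burst in microseconds
    next_mpi_call: str   # MPI call that ends the burst
    counters: dict = field(default_factory=dict)  # counter name -> value


def merge_runs(run_a, run_b, rel_tol=0.10):
    """Merge counters of bursts that match across two runs.

    Assumed matching rule: same rank and position (structure), same
    terminating MPI call (communication pattern), and durations within
    rel_tol of each other (timing). Unmatched bursts are dropped.
    """
    by_key = {(b.rank, b.index): b for b in run_b}
    merged = []
    for a in run_a:
        b = by_key.get((a.rank, a.index))
        if b is None or a.next_mpi_call != b.next_mpi_call:
            continue  # structure or communication pattern differs
        longest = max(a.duration_us, b.duration_us)
        if longest > 0 and abs(a.duration_us - b.duration_us) / longest > rel_tol:
            continue  # timing differs too much to trust the match
        merged.append(
            Burst(
                rank=a.rank,
                index=a.index,
                duration_us=(a.duration_us + b.duration_us) / 2,
                next_mpi_call=a.next_mpi_call,
                counters={**a.counters, **b.counters},  # union of counter sets
            )
        )
    return merged


# Two runs of the same two bursts, each run recording a different counter.
run_a = [
    Burst(0, 0, 100.0, "MPI_Allreduce", {"PAPI_TOT_INS": 1000}),
    Burst(0, 1, 50.0, "MPI_Send", {"PAPI_TOT_INS": 400}),
]
run_b = [
    Burst(0, 0, 104.0, "MPI_Allreduce", {"PAPI_L1_DCM": 20}),
    Burst(0, 1, 90.0, "MPI_Send", {"PAPI_L1_DCM": 9}),
]
merged = merge_runs(run_a, run_b)
```

In this sketch the first burst pair is merged into one record carrying both counters, while the second pair is rejected because its durations disagree by more than the tolerance. A real implementation would need to handle reordered or missing bursts, which this positional matching does not attempt.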

Impact: Enables more comprehensive hardware counter data for ML models, potentially improving the accuracy of HPC performance predictions.

Ranking rationale: Publication of an academic paper on a novel methodology for HPC trace analysis.

Read on arXiv cs.LG

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. arXiv cs.LG · TIER_1 · English (EN) · Marta Garcia-Gasulla

    Heuristic-Based Merging of HPC Traces to Extend Hardware Counter Coverage

    This work extends a framework for predicting the performance of High-Performance Computing (HPC) workloads using Machine Learning (ML). A common limitation in performance modeling is the restricted number of hardware counters that can be collected simultaneously. To address this,…