Particle physics models engineered for data-driven scaling laws

By PulseAugur Editorial · [1 source] · 2026-06-19 04:00

Researchers are exploring how to engineer scaling laws for particle physics models, drawing parallels to large language models. Unlike natural language or image domains, fundamental physics has high-fidelity simulators that generate synthetic data cheaply. This makes it possible to engineer the dataset itself so that scaling favors more data over more parameters. For the task of classifying hadronic jets, a study demonstrated that including diverse, task-aligned pretraining data can shift the scaling behavior toward favoring more data.
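
To make the data-versus-parameters trade-off concrete, here is a minimal sketch of a parametric scaling law. The summary does not state the functional form used in the study, so this assumes the form popularized for language models, L(N, D) = E + A/N^alpha + B/D^beta, with compute taken as proportional to N times D. All function names, constants, and the factor k are hypothetical and chosen only for illustration.

```python
def loss(n_params, n_samples, E, A, alpha, B, beta):
    """Assumed parametric loss: an irreducible term plus two power laws."""
    return E + A / n_params**alpha + B / n_samples**beta


def allocation_exponents(alpha, beta):
    """Return (a, b) such that N_opt ~ C**a and D_opt ~ C**b when C ~ N * D.

    Derived by minimizing the assumed loss under a fixed compute budget.
    """
    total = alpha + beta
    return beta / total, alpha / total


def optimal_params(compute, A, alpha, B, beta, k=6.0):
    """Closed-form N minimizing the assumed loss subject to compute = k * N * D."""
    prefactor = ((alpha * A) / (beta * B)) ** (1.0 / (alpha + beta))
    return prefactor * (compute / k) ** (beta / (alpha + beta))


if __name__ == "__main__":
    # Hypothetical exponents: the loss falls faster with parameters than with data,
    # so the compute-optimal budget grows the dataset faster than the model.
    a, b = allocation_exponents(alpha=0.6, beta=0.3)
    print(f"N_opt ~ C^{a:.3f}, D_opt ~ C^{b:.3f}")
```

Under these assumptions, a scaling law "favors data" when b exceeds a, that is, when alpha exceeds beta. Changing the pretraining data composition would show up here as a change in the fitted exponents, which in turn moves the compute-optimal split between model size and dataset size.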

IMPACT This research could lead to more efficient model training in scientific domains by optimizing data composition.

RANK_REASON Academic paper on scaling laws in particle physics.


AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Jan-Lucas Uslu, Kevin Greif, Daniel Whiteson, Benjamin Nachman · 2026-06-19 04:00

Towards Engineering Scaling Laws with Pretraining Data Composition

arXiv:2606.19781v1 Announce Type: cross Abstract: Neural scaling laws describe how model performance improves as a power law in compute, model size, and dataset size. While well-established for large language models, these relationships are emerging for large models in particle p…

