A new arXiv paper compares pre-training objectives for foundation models in simulation-based sciences, focusing on high-energy physics. Using the OmniLearned high-energy physics foundation model framework, the study evaluates three objectives: supervised classification, flow-matching generation, and self-supervised masked particle modeling. The results indicate that classifier-only pre-training performs best when labels are abundant, while combining it with masked particle modeling is highly effective when labels are scarce. For generative downstream tasks, pre-training yields a significant advantage only when flow matching is included among the objectives.
AI-generated summary · Google Gemini · from 1 source.