Brief · PulseAugur

RESEARCH · arXiv cs.LG · 4d · [2 sources]

Influence Dynamics and Stagewise Data Attribution

Two new research papers examine how individual training examples influence large machine learning models. The first introduces a framework for "stagewise data attribution," arguing that a sample's influence changes over the course of training rather than staying fixed, particularly in language models. The second proposes the "Mirrored Influence Hypothesis," which reformulates the influence-estimation problem so that it relies on forward passes, making it more computationally efficient; it is applicable to several settings, including diffusion models and language models.
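To make the forward-pass idea concrete, here is a toy sketch under stated assumptions, not the authors' implementation. It assumes one reading of the reformulation: take a single gradient step on a test example, then score each training example by how much its loss drops, which needs only forward passes over the training set. The linear model, the squared loss, the learning rate, and the names `squared_loss` and `forward_influence_scores` are all hypothetical choices made for illustration.

```python
import numpy as np


def squared_loss(w, x, y):
    """Per-example squared error of a linear model with weights w."""
    return 0.5 * (x @ w - y) ** 2


def forward_influence_scores(w, X_train, y_train, x_test, y_test, lr=0.1):
    """Score training points by their loss drop after one step on the test point.

    Illustrative assumption: a larger drop means the training point is more
    closely related to the test point. Only the single test-point update needs
    a gradient; the training set is touched by forward passes alone.
    """
    # Gradient of the squared loss at the test example (closed form for a linear model).
    grad_test = (x_test @ w - y_test) * x_test
    w_updated = w - lr * grad_test

    # Forward passes over the training set before and after the update.
    loss_before = squared_loss(w, X_train, y_train)
    loss_after = squared_loss(w_updated, X_train, y_train)
    return loss_before - loss_after


# Tiny hand-made dataset (hypothetical values).
w0 = np.zeros(2)
X_train = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
y_train = np.array([1.0, 1.0, 1.0])
x_test, y_test = np.array([1.0, 0.0]), 1.0

scores = forward_influence_scores(w0, X_train, y_train, x_test, y_test)
print(scores)
```

In this toy setup the training point that matches the test point gets the highest score, the orthogonal point gets a score of zero, and the opposing point gets a negative score. The cost is one update plus two forward passes over the training set, rather than a gradient computation per training example.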

IMPACT These papers introduce new theoretical frameworks and computational methods for understanding data influence in ML models, potentially improving model trustworthiness and debugging capabilities.

Myeongseob Ko
Mirrored Influence Hypothesis
arXiv
Jin Hwa Lee