Bergson: An Open Source Library for Data Attribution
Bergson is a newly released open-source library for data attribution in machine learning. It aims to make model behavior easier to explain by tracing that behavior back to the training data that influenced it. Bergson offers techniques that scale to large language models and pre-training datasets, including support for distributed training and on-disk gradient stores. It also provides open-source implementations of three prominent data attribution methods: MAGIC, SOURCE, and TrackStar.
AI IMPACT: Gives researchers scalable data attribution tools that make it easier to debug models and curate training datasets.
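
For readers new to the idea, gradient-based data attribution can be illustrated with a toy example. The sketch below is not Bergson's API and does not implement MAGIC, SOURCE, or TrackStar; it assumes a tiny logistic-regression model and scores each training example by the dot product of its loss gradient with the gradient of a query example, a simple gradient-similarity heuristic. The function names `logistic_grad` and `attribution_scores` are hypothetical and chosen only for illustration.

```python
import math

def logistic_grad(w, x, y):
    """Gradient of the logistic loss with respect to w for one example (x, y)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
    return [(p - y) * xi for xi in x]

def attribution_scores(w, train, query):
    """Score each training example by gradient similarity to the query.

    A larger positive score suggests that a gradient step on the training
    example would also reduce the loss on the query example.
    """
    qx, qy = query
    g_query = logistic_grad(w, qx, qy)
    scores = []
    for x, y in train:
        g_train = logistic_grad(w, x, y)
        scores.append(sum(a * b for a, b in zip(g_train, g_query)))
    return scores

# Toy data (illustrative only): three training examples and one query.
train = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
query = ([1.0, 0.0], 1.0)
w = [0.0, 0.0]  # untrained weights, so every prediction is 0.5

scores = attribution_scores(w, train, query)
print(scores)
```

In this toy setup the first and third training examples share a feature and a label with the query, so they receive positive scores, while the second example has no overlapping feature and scores zero. At the scale of large language models, gradients are far too large to recompute for every query, which is presumably why features such as on-disk gradient stores matter.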