Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data
Researchers have developed Gap-K%, a method for detecting whether a given text was part of a large language model's pretraining data. The technique measures the gap between the model's top-1 prediction and the actual target token at each position, motivated by the observation that such mismatches produce the gradient signals that training penalizes. By also incorporating local correlations between neighboring tokens, Gap-K% is reported to significantly outperform existing methods on benchmarks such as WikiMIA and MIMIR, offering a more robust way to identify training data.
AI IMPACT: Enhances transparency and accountability in LLM development by providing a tool to identify training data sources.
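
To make the idea of a top-1 prediction gap concrete, here is a minimal sketch of one way such a score could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function name gap_k_score, the parameters k and window, the use of a simple moving average to capture local token correlations, and the averaging over the lowest k fraction of positions (borrowed from Min-K%-style scoring) are all hypothetical choices made for this example.

```python
import numpy as np

def gap_k_score(logits, target_ids, k=0.2, window=3):
    """Hypothetical Gap-K%-style score (illustrative, not the paper's exact formula).

    logits:     array of shape (seq_len, vocab_size), next-token logits per position
    target_ids: array of shape (seq_len,), the actual token at each position
    k:          fraction of lowest-scoring positions to average (assumed)
    window:     moving-average width for local smoothing (assumed)
    """
    logits = np.asarray(logits, dtype=np.float64)
    target_ids = np.asarray(target_ids)

    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    # Gap between the target token's log-probability and the top-1 log-probability.
    # It is 0 when the target is the model's top prediction, negative otherwise.
    top1 = log_probs.max(axis=-1)
    target = log_probs[np.arange(len(target_ids)), target_ids]
    gaps = target - top1

    # Smooth over neighboring tokens to damp token-level fluctuations.
    w = min(window, len(gaps))
    smoothed = np.convolve(gaps, np.ones(w) / w, mode="valid")

    # Average the lowest k fraction of smoothed gaps.
    n = max(1, int(np.ceil(k * len(smoothed))))
    return float(np.sort(smoothed)[:n].mean())

# Toy logits for a 4-token sequence over a 3-token vocabulary.
toy_logits = [[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]]

matching = gap_k_score(toy_logits, [0, 1, 2, 0], k=0.5, window=2)     # targets are always top-1
mismatched = gap_k_score(toy_logits, [1, 0, 0, 1], k=0.5, window=2)   # targets are never top-1
```

In this sketch a score closer to zero means the model's top prediction agrees with the text more often, which would be weak evidence that the text was seen in training. In practice the logits would come from a real language model and the decision threshold would need to be calibrated on known member and non-member texts.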