I was fine-tuning a language model on Arabic. The loss was perfect. It spoke Chinese.
TrainSafe is a new open-source tool for catching failures during language-model fine-tuning. Its developer built it after a model fine-tuned on Arabic unexpectedly generated Chinese text, a reminder that a low loss does not guarantee successful training. TrainSafe integrates with HuggingFace and TRL pipelines and, at each evaluation checkpoint, checks for language drift, output length, repetition, prompt echoing, and format consistency. If output quality falls below a set threshold, TrainSafe can halt training and identify the last stable checkpoint.
AI IMPACT: Gives developers a way to catch critical errors during LLM fine-tuning, saving compute and time.
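
To make the idea of a checkpoint-time language-drift check concrete, here is a minimal sketch in plain Python. It is not TrainSafe's code or API: the names `script_fraction`, `repeated_ngram_fraction`, and `DriftMonitor`, and the thresholds 0.8 and 0.3, are assumptions made up for illustration. It scores sampled generations by the share of letters in the expected script and by the share of repeated word n-grams, and remembers the last step that passed.

```python
import unicodedata
from collections import Counter


def script_fraction(text, script_prefix="ARABIC"):
    """Share of alphabetic characters whose Unicode name starts with script_prefix."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(
        1 for ch in letters
        if unicodedata.name(ch, "").startswith(script_prefix)
    )
    return hits / len(letters)


def repeated_ngram_fraction(text, n=3):
    """Share of whitespace-token n-grams that occur more than once."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


class DriftMonitor:
    """Hypothetical monitor: tracks the last evaluation step that looked healthy."""

    def __init__(self, script_prefix="ARABIC", min_script=0.8, max_repeat=0.3):
        self.script_prefix = script_prefix
        self.min_script = min_script    # assumed threshold, tune per task
        self.max_repeat = max_repeat    # assumed threshold, tune per task
        self.last_good_step = None

    def update(self, step, samples):
        """Score sampled generations; return True if training should stop."""
        if not samples:
            return True  # nothing generated counts as a failure here
        script = sum(script_fraction(s, self.script_prefix) for s in samples) / len(samples)
        repeat = sum(repeated_ngram_fraction(s) for s in samples) / len(samples)
        healthy = script >= self.min_script and repeat <= self.max_repeat
        if healthy:
            self.last_good_step = step
        return not healthy


monitor = DriftMonitor()
stop_at_100 = monitor.update(100, ["مرحبا بالعالم"])  # Arabic sample
stop_at_200 = monitor.update(200, ["你好世界"])        # drifted to Chinese
print(stop_at_100, stop_at_200, monitor.last_good_step)
```

In a HuggingFace `Trainer` run, one plausible way to use such a monitor is to call `update` from a `TrainerCallback.on_evaluate` hook and set `control.should_training_stop = True` when it returns `True`. Whether TrainSafe wires itself in this way is not stated in the source.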