Researchers have introduced BatteryPass-12K, the first dataset designed for classifying digital battery passport conformance, in anticipation of the EU's upcoming battery regulation. They evaluated 22 language models and found that GPT-5.4 achieved the highest performance in zero-shot inference. The study also found that few-shot examples significantly boost performance and that a larger parameter count does not guarantee better results: some smaller models outperformed larger ones. Prompt-injection attacks degraded model performance on this task.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT: New dataset and model evaluations may inform development of AI for regulatory compliance in the battery sector.
RANK_REASON: Academic paper introducing a new dataset and evaluating language models on a novel task.