Researchers have introduced BatteryPass-12K, the first dataset designed for classifying digital battery passport conformance, in anticipation of the EU's upcoming battery regulation. They evaluated 22 language models, finding that GPT-5.4 achieved the highest performance in zero-shot inference. The study also revealed that few-shot examples significantly boost performance, and that scaling model parameters does not always guarantee better results, as some smaller models outperformed larger ones. Prompt-injection attacks were found to degrade model performance on this task.
IMPACT New dataset and model evaluations may inform development of AI for regulatory compliance in the battery sector.
RANK_REASON Academic paper introducing a new dataset and evaluating language models on a novel task.
AI-generated summary · Google Gemini · from 1 source.