Open-source LLMs fall short on complex cyber threat intelligence classification

By PulseAugur Editorial · [2 sources] · 2026-06-16 17:04

A new research paper evaluates seven open-source large language models (LLMs) on classifying complex cyber threat intelligence (CTI) reports. The study built a dataset of 2,076 human-annotated sentences mapped to 114 MITRE ATT&CK techniques. The best-performing LLM reached a micro-averaged F1 score of only 0.22, which indicates that current open-source LLMs are not yet sufficient for production-grade ATT&CK classification. Performance correlated positively with model parameter count, whereas changes to prompt strategy and temperature did not yield significant gains.
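For readers unfamiliar with the metric, here is a minimal sketch of how a micro-averaged F1 score can be computed for multi-label technique classification. It is not the paper's evaluation code: the function name `micro_f1` and the toy sentences and labels are hypothetical, chosen only to illustrate the arithmetic. Micro-averaging pools true positives, false positives, and false negatives across all sentences and labels before computing a single F1.

```python
def micro_f1(gold: list[set[str]], pred: list[set[str]]) -> float:
    """Micro-averaged F1 over per-sentence label sets.

    gold[i] and pred[i] are the annotated and predicted ATT&CK
    technique IDs for sentence i (hypothetical input layout).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and annotated
        fp += len(p - g)   # labels predicted but not annotated
        fn += len(g - p)   # labels annotated but missed

    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there is nothing to score.
    return 2 * tp / denom if denom else 0.0


# Toy data for illustration only; these are not results from the paper.
gold = [{"T1059", "T1027"}, {"T1566"}, {"T1105"}]
pred = [{"T1059"}, {"T1566", "T1204"}, {"T1071"}]

print(micro_f1(gold, pred))  # tp=2, fp=2, fn=2 -> 0.5
```

Because the counts are pooled, frequently occurring techniques dominate the score; a macro average would instead weight each of the 114 techniques equally.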

IMPACT Current open-source LLMs demonstrate insufficient capability for complex cyber threat intelligence classification, highlighting a need for further research and development in this domain.

RANK_REASON The cluster contains an academic paper evaluating LLM performance on a specific task.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.LG TIER_1 English(EN) · Ahmed Ryan, Saad Sakib Noor, Md Erfan, Shaswata Mitra, Sudip Mittal, Md Rayhanur Rahman · 2026-06-17 04:00

Evaluating Open-Source LLMs for Multi-Label ATT&CK Technique Classification on CTI Reports

arXiv:2606.18166v1 (announce type: cross). Abstract: Classifying Cyber Threat Intelligence (CTI) using MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) is essential for proactive defense, but historically required extensive human effort. Pre-Large Language Mo…

arXiv cs.LG TIER_1 English(EN) · Md Rayhanur Rahman · 2026-06-16 17:04

Evaluating Open-Source LLMs for Multi-Label ATT&CK Technique Classification on CTI Reports

Classifying Cyber Threat Intelligence (CTI) using MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) is essential for proactive defense, but historically required extensive human effort. Pre-Large Language Model (LLM) automation sped up this process, but could n…
