Researchers have developed a new multi-source cybersecurity dataset that combines system, network, and browser logs with detailed MITRE ATT&CK technique labels. The dataset comprises 870 sessions and approximately 2.3 million events, and it addresses the limitations of existing datasets by labeling malicious activity at a granular level. To demonstrate its utility, three Small Language Models (SLMs), Qwen2.5-1.5B, Llama-3.2-3B, and Phi-4-Mini, were fine-tuned using Low-Rank Adaptation (LoRA). Fine-tuning significantly improved performance across all models and metrics: chunk classification accuracy rose from around 8% to 90-97%, though technique identification remained a challenge, with a best exact-match accuracy of 42%.
AI IMPACT This dataset and evaluation could advance the development of more robust AI-powered cybersecurity threat detection systems.
RANK_REASON The cluster describes a new academic dataset and evaluation of existing models, fitting the research bucket.
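The summary reports two metrics without defining them. As a rough illustration only, the sketch below assumes that chunk classification accuracy is the share of log chunks given the correct label, and that exact-match accuracy counts a chunk as correct only when the predicted set of ATT&CK technique IDs equals the labeled set. Both definitions, the function names, and the sample data are assumptions for illustration, not taken from the paper.

```python
def chunk_accuracy(predicted, actual):
    """Share of chunks whose predicted label equals the reference label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def technique_exact_match(predicted, actual):
    """Share of chunks whose predicted technique-ID set equals the labeled set.

    Assumed definition: a partial overlap counts as a miss.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    hits = sum(set(p) == set(a) for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical sample data, not drawn from the dataset.
labels_pred = ["malicious", "benign", "malicious", "benign"]
labels_true = ["malicious", "benign", "benign", "benign"]

techniques_pred = [["T1059"], ["T1071", "T1105"], []]
techniques_true = [["T1059"], ["T1071"], []]

print(chunk_accuracy(labels_pred, labels_true))
print(technique_exact_match(techniques_pred, techniques_true))
```

Under this assumed definition, a prediction that names one extra or one missing technique scores zero for that chunk, which is one plausible reason exact-match figures can sit well below chunk-level accuracy.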
- Atlas Ai Model
- CICAPT-IIoT
- command and control
- Llama 3.2:3b
- LMDG
- Low Rank Adaptation
- Microsoft Windows
- MITRE
- MITRE ATT&CK
- Phi-4 Mini
- Qwen2.5-1.5B
- remote access trojan
- UNSW-NB15
AI-generated summary · Google Gemini · from 1 source.