Researchers have developed a new multi-source cybersecurity dataset that combines system, network, and browser logs from Windows endpoints. The dataset contains 870 sessions and approximately 2.3 million events and is labeled with specific MITRE ATT&CK technique IDs, addressing a gap in existing public datasets. To test its utility, three Small Language Models (SLMs) – Qwen2.5-1.5B, Llama-3.2-3B, and Phi-4-Mini – were fine-tuned using Low-Rank Adaptation (LoRA). Fine-tuning raised chunk classification accuracy from around 8% to 90-97%, though technique identification remained a challenge, with a best exact-match accuracy of 42%.
AI IMPACT This new dataset and the fine-tuned SLM evaluations could improve detection of multi-stage cyberattacks.
RANK_REASON The cluster describes a new academic dataset and evaluation of existing models on that dataset, published on arXiv.
- Atlas Ai Model
- CICAPT-IIoT
- command and control
- Llama 3.2:3b
- LMDG
- Low Rank Adaptation
- Microsoft Windows
- MITRE
- MITRE ATT&CK
- Phi-4 Mini
- Qwen2.5-1.5B
- remote access trojan
- UNSW-NB15
AI-generated summary · Google Gemini · from 2 sources.