PulseAugur

New Dataset Automates Android Malware Source Code Collection

Researchers have developed MASCOT-Android, a dataset and automated pipeline for collecting Android malware source code from GitHub. The pipeline uses a LinearSVC classifier trained on TF-IDF features extracted from repository README documents to identify malware repositories with high accuracy. Automating this screening step reduces the cost and effort of manual review and makes the discovery of malware source code scalable.
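The README-based classification step can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn's `TfidfVectorizer` and `LinearSVC` (the summary names the model and features but not the library), and the README snippets, labels, hyperparameters, and the helper `is_malware_readme` are all hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy README snippets; these are NOT taken from MASCOT-Android.
readmes = [
    "Android RAT with keylogger, SMS stealer and remote shell payload",
    "Banking trojan for Android: overlay attack, steals credentials, C2 panel",
    "Android ransomware sample that encrypts files and locks the screen",
    "Spyware APK builder, hides icon, records calls, exfiltrates contacts",
    "A simple todo list app for Android built with Kotlin and Room",
    "Weather widget for Android using Retrofit and Material design",
    "Android image loading library with caching and smooth scrolling",
    "Recipe manager app, Jetpack Compose UI, offline storage",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = malware repository, 0 = benign

# TF-IDF turns each README into a sparse weighted term vector;
# LinearSVC then fits a linear decision boundary over those vectors.
# The settings below are assumed defaults, not values from the paper.
classifier = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LinearSVC(C=1.0),
)
classifier.fit(readmes, labels)


def is_malware_readme(text: str) -> bool:
    """Return True if the classifier flags this README text as malware."""
    return bool(classifier.predict([text])[0] == 1)
```

In a collection pipeline of this kind, a function like `is_malware_readme` would be applied to the README of each candidate repository so that only flagged repositories go forward to closer inspection. A real deployment would need a far larger labeled corpus and a held-out evaluation set.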



AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI · Bojing Li, Duo Zhong, Prajna Bhandary, Raguvir S, Charles Maxa, Robert J Joyce, Charles Nicholas

    MASCOT-Android: A Curated Dataset and Automated Collection Pipeline for Android Malware Source Code Specimens

    arXiv:2606.16072v1 (cross-listed). Abstract: Compared with binaries and decompiled code, malware source code more directly reflects the attackers' original intent. However, the scarcity of source code and the high cost of manual review make such datasets difficult to build a…