Researchers have developed COVA-X, an expanded dataset containing 10,985 synthetic conversations designed to detect multi-turn smishing attacks, particularly those targeting the elderly. This new dataset, an improvement over the initial COVA dataset, addresses several issues in the generation pipeline to provide cleaner and more comprehensive data. The expanded dataset enabled the Longformer model to outperform XGBoost in detecting smishing attempts, achieving higher accuracy and macro F1 scores, which highlights the need for larger conversational corpora to leverage the full potential of transformer models.
AI IMPACT Improved datasets and models for smishing detection can enhance cybersecurity defenses against evolving scam tactics.
AI-generated summary · Google Gemini · from 2 sources.