Two-stage multi-datasource machine learning for attack technique and lifecycle detection

Ying Dar Lin, Shin Yi Yang, Didik Sudyana*, Fietyata Yudha, Yuan Cheng Lai, Ren Hung Hwang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Intrusion detection systems (IDS) increasingly adopt machine learning (ML) techniques to detect a wide range of attack variants. However, current research focuses primarily on identifying specific attack types or techniques from a single data source. This approach lacks a holistic view of attacks and can result in missed detections. To respond to detected attacks more effectively, it is essential to identify them by their lifecycles and to incorporate information from multiple data sources. In this study, we present three approaches for detecting attack lifecycles, each using a different ML methodology: a single-stage ML model, a two-stage ML+ML approach, and ML with sequence matching (ML+SM). We also explore the benefit of using multiple data sources, including network traffic, system logs, and host statistics, to improve technique detection. Our evaluation shows that, for lifecycle detection, the two-stage ML+ML approach outperforms the others with an F1 score of 0.994, whereas the single-stage and ML+SM methods yield F1 scores of 0.887 and 0.189, respectively. Integrating multiple data sources also proves advantageous: combining all three sources yields the highest F1 score for technique detection, 0.922.
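
The results above are reported as F1 scores. As a reminder of what that metric measures, the following minimal sketch computes F1 as the harmonic mean of precision and recall from raw counts. The function name and the counts are hypothetical and are not taken from the paper, and the abstract does not state whether its scores are averaged per class (macro) or overall (micro), so this shows only the single-class case.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # share of raised alerts that were real attacks
    recall = tp / (tp + fn)     # share of real attacks that raised an alert
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
example = f1_score(tp=90, fp=10, fn=30)
print(round(example, 3))
```

Because F1 penalizes both false alarms and missed detections, a detector that misses many attacks scores low even if its alerts are almost always correct.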

Original language: English
Article number: 103859
Journal: Computers and Security
Volume: 142
State: Published - Jul 2024

Keywords

  • Attack lifecycle detection
  • ML-based IDS
  • Multi-datasource IDS
  • Two-stage lifecycle detection
