Machine Learning with Variational AutoEncoder for Imbalanced Datasets in Intrusion Detection

Ying Dar Lin, Zi Qiang Liu, Ren Hung Hwang*, Van Linh Nguyen, Po Ching Lin, Yuan Cheng Lai

*Corresponding author for this work

Research output: Article, peer-reviewed

18 Citations (Scopus)

Abstract

As a result of the explosion of security attacks and the complexity of modern networks, machine learning (ML) has recently become the favored approach for intrusion detection systems (IDS). However, the ML approach usually faces three challenges: massive attack variants, imbalanced data, and appropriate data segmentation. Improper handling of these issues significantly degrades ML performance, e.g., resulting in high false-negative rates and low recall. Despite many efforts in the literature, detecting security attacks in a complicated network environment with imperfect data collection is still an open issue. This work proposes a machine learning framework that combines a variational autoencoder with a multilayer perceptron model to deal with imbalanced datasets and detect the explosion of attack variants on the Internet. The detection engine also includes an efficient range-based sequential search algorithm to address the segmentation challenge in pre-processing data from multiple sources (network packets, system/statistic logs). Our work is the first attempt to demonstrate the effect of using an appropriate combination of ML models to boost IDS detection capability in a heterogeneous environment, where imperfect data collection is common. Experimental results on a public system log dataset (e.g., HDFS) show that our method achieves up to approximately 97% on F1 score and 98% on recall, a promising result compared with the same measurements for other solutions. Moreover, we found that the proposed treatment of imbalanced datasets can improve the F1 score by up to 35% and recall by up to 27%. The testing results also indicate that our model can detect new attack variants.
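
The abstract reports results as F1 score and recall rather than accuracy. A minimal sketch of why that matters on imbalanced data follows; the confusion-matrix counts are hypothetical values chosen for illustration and are not taken from the paper or the HDFS dataset.

```python
def recall_and_f1(tp, fp, fn):
    """Return (recall, f1) from true-positive, false-positive and false-negative counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


# Hypothetical imbalanced test set: 1000 samples, only 20 of them attacks.
# The detector catches 5 attacks, misses 15, and raises 1 false alarm.
tp, fp, fn, tn = 5, 1, 15, 979

accuracy = (tp + tn) / (tp + fp + fn + tn)
recall, f1 = recall_and_f1(tp, fp, fn)

# Accuracy looks high because normal samples dominate,
# while recall and F1 expose the missed attacks.
print(f"accuracy={accuracy:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With these assumed counts, most attacks go undetected even though nearly every sample is classified correctly, which is the failure mode the abstract describes as high false-negative rates and low recall.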

Original language: English
Pages (from - to): 15247-15260
Number of pages: 14
Journal: IEEE Access
Volume: 10
Publication status: Published - 2022
