TY - JOUR
T1 - Data Randomization and Cluster-Based Partitioning for Botnet Intrusion Detection
AU - Al-Jarrah, Omar Y.
AU - Alhussein, Omar
AU - Yoo, Paul D.
AU - Muhaidat, Sami
AU - Taha, Kamal
AU - Kim, Kwangjo
N1 - Funding Information:
This work was supported in part by the Khalifa University of Science, Technology and Research-Korea Advanced Institute of Science and Technology (KAIST) Institute, in part by the KAIST, Korea, and in part by the National Research Foundation of Korea through the Korea government (MSIP) under Grant NRF-2015R1A2A2A01006812. This paper was recommended by Associate Editor L. D. Xu.
Publisher Copyright:
© 2015 IEEE.
PY - 2016/8
Y1 - 2016/8
AB - Botnets, which consist of remotely controlled compromised machines called bots, provide a distributed platform for several threats against cyber world entities and enterprises. An intrusion detection system (IDS) provides an efficient countermeasure against botnets. It continually monitors and analyzes network traffic for potential vulnerabilities and the possible existence of active attacks. A payload-inspection-based IDS (PI-IDS) identifies active intrusion attempts by inspecting the payloads of transmission control protocol and user datagram protocol packets and comparing them with previously seen attack signatures. However, the ability of a PI-IDS to detect intrusions might be incapacitated by packet encryption. A traffic-based IDS (T-IDS) alleviates the shortcomings of a PI-IDS, as it does not inspect packet payloads; instead, it analyzes packet headers to identify intrusions. As network traffic grows rapidly, not only is the detection rate critical, but the efficiency and scalability of the IDS also become more significant. In this paper, we propose a state-of-the-art T-IDS built on a novel randomized data partitioned learning model (RDPLM), relying on a compact network feature set and feature selection techniques, simplified subspacing, and a multiple randomized meta-learning technique. The proposed model achieved 99.984% accuracy and a training time of 21.38 s on a well-known benchmark botnet dataset. Experimental results demonstrate that the proposed methodology outperforms other well-known machine-learning models used in the same detection task, namely, sequential minimal optimization, deep neural network, C4.5, reduced error pruning tree, and random tree.
KW - Botnet intrusion detection
KW - efficient learning
KW - ensembles
KW - feature selection
KW - machine-learning (ML)
UR - http://www.scopus.com/inward/record.url?scp=84946083260&partnerID=8YFLogxK
U2 - 10.1109/TCYB.2015.2490802
DO - 10.1109/TCYB.2015.2490802
M3 - Article
AN - SCOPUS:84946083260
SN - 2168-2267
VL - 46
SP - 1796
EP - 1806
JO - IEEE Transactions on Cybernetics
JF - IEEE Transactions on Cybernetics
IS - 8
M1 - 7312964
ER -