Machine-learning-based feature selection techniques for large-scale network intrusion detection

O. Y. Al-Jarrah, A. Siddiqui, M. Elsalamouny, P. D. Yoo, S. Muhaidat, K. Kim

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

78 Scopus citations

Abstract

Cyber-attacks on major Internet sites and enterprise networks are increasingly common. An intrusion detection system (IDS) is a critical component of the defense mechanisms for such infrastructure: it monitors and analyzes network activity for potential intrusions and security attacks. Machine-learning (ML) models have been well accepted for signature-based IDSs because of their learnability and flexibility. However, the performance of existing IDSs does not seem satisfactory, given the rapid evolution of sophisticated cyber threats in recent decades. Moreover, the volume of data to be analyzed is beyond the capacity of commonly used computer software and hardware tools: the data are not only large in scale but also move in and out at high velocity. In a big-data IDS, one must therefore find an efficient way to reduce both the dimensionality and the volume of the data. In this paper, we propose two novel feature selection methods, RF-FSR (Random Forest-Forward Selection Ranking) and RF-BER (Random Forest-Backward Elimination Ranking). The features selected by the proposed methods were tested and compared with three of the most well-known feature sets in the IDS literature. The experimental results showed that the features selected by the proposed methods effectively improved the detection rate and false-positive rate, achieving 99.8% and 0.001%, respectively, on the well-known KDD-99 dataset.
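The two metrics reported in the abstract follow the standard confusion-matrix definitions: detection rate is TP / (TP + FN) and false-positive rate is FP / (FP + TN). The following is a minimal sketch of that arithmetic only; the function names and the toy labels are illustrative assumptions and do not come from the paper or from the KDD-99 dataset.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (TP, FP, TN, FN) for binary labels where 1 = attack, 0 = normal."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # attack correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # normal traffic wrongly flagged
        elif truth == 0 and pred == 0:
            tn += 1  # normal traffic correctly passed
        else:
            fn += 1  # attack missed
    return tp, fp, tn, fn


def detection_rate(tp: int, fn: int) -> float:
    """Fraction of actual attacks that were flagged: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of normal records wrongly flagged: FP / (FP + TN)."""
    total = fp + tn
    return fp / total if total else 0.0


# Toy labels, invented for illustration only (not KDD-99 data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
dr = detection_rate(tp, fn)
fpr = false_positive_rate(fp, tn)
print(f"detection rate = {dr:.3f}, false-positive rate = {fpr:.3f}")
```

A feature subset is preferable when it raises the detection rate without raising the false-positive rate, which is why the two figures are reported together.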

Original language: British English
Title of host publication: Proceedings 2014 IEEE 34th International Conference on Distributed Computing Systems Workshops, ICDCSW 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 177-181
Number of pages: 5
ISBN (Electronic): 9781479941810
State: Published - 29 Aug 2014
Event: 2014 IEEE 34th International Conference on Distributed Computing Systems Workshops, ICDCSW 2014 - Madrid, Spain
Duration: 30 Jun 2014 to 3 Jul 2014

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems
Volume: 30-June-2014

Conference

Conference: 2014 IEEE 34th International Conference on Distributed Computing Systems Workshops, ICDCSW 2014
Country/Territory: Spain
City: Madrid
Period: 30/06/14 to 3/07/14

Keywords

  • Feature selection
  • Intrusion detection system
  • Machine learning
  • Random forest
