Gradient boosting decision trees for cyber security threats detection based on network events logs

  • Quang Hieu Vu
  • Dymitr Ruta
  • Ling Cen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

31 Scopus citations

Abstract

The rapidly expanding Internet of Things (IoT) is evolving into a connected network of AI-enabled, smart, multi-sensory devices that generate, consume and exchange enormous amounts of data. Stimulated by omnipresent cloud services and ever-widening data bandwidth, these devices aspire to control every aspect of our lives, from transportation to health, significantly increasing our reliance on web-based services and their security. Millions of cyber security alerts are already generated every day, triggering increasingly costly investigations by Security Operations Centres (SOCs). To make SOC operations more efficient, security warnings need to be reliably detected and classified by severity, scale of potential damage, or the ability to defend against them.

We responded to this challenge in the context of the IEEE BigData Cup 2019, which focused on predicting which cyber security threats require attention based on detailed logs of the network activity leading up to each security alert. We developed a hybrid supervised learning ensemble model combining several state-of-the-art Extreme Gradient Boosting algorithms. Specifically, XGBoost and LightGBM models were built on separate sets of features extracted from the raw logs of network events preceding the alerts and then synergistically aggregated. Model diversity arising from algorithmic differences, complementary feature subsets, and individually optimized hyperparameters, combined with a robust stratified cross-validation scheme, resulted in the best true-alert detection rate, yielding an AUC score in excess of 0.93 that outperformed all 248 other competing teams.
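The abstract does not state how the individual model outputs were aggregated or how AUC was computed. As a rough illustration only, the sketch below assumes rank averaging as the blending rule and computes AUC from the Mann-Whitney rank statistic. The helper names (`average_ranks`, `auc`, `rank_average`) and the toy scores are hypothetical stand-ins for the predictions of two separately trained models; no XGBoost or LightGBM model is trained here.

```python
from typing import List, Sequence


def average_ranks(scores: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied scores their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the Mann-Whitney U statistic (labels must be 0 or 1)."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def rank_average(*model_scores: Sequence[float]) -> List[float]:
    """Blend models by averaging per-sample ranks (assumed blending rule)."""
    all_ranks = [average_ranks(s) for s in model_scores]
    return [sum(col) / len(all_ranks) for col in zip(*all_ranks)]


# Toy data invented for illustration; 1 marks an alert that needs attention.
labels = [0, 0, 1, 1, 0, 1]
scores_a = [0.1, 0.7, 0.8, 0.6, 0.3, 0.9]  # hypothetical model A output
scores_b = [0.3, 0.1, 0.7, 0.8, 0.9, 0.6]  # hypothetical model B output
blended = rank_average(scores_a, scores_b)

print(round(auc(labels, scores_a), 3))  # 0.889
print(round(auc(labels, scores_b), 3))  # 0.667
print(round(auc(labels, blended), 3))   # 0.944
```

On this toy data the blend scores higher than either input because the two models misrank different samples, which is the kind of diversity the abstract credits for the ensemble's result. Rank averaging is only one option; averaging raw probabilities or training a second-level model are common alternatives.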

Original language: British English
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5921-5928
Number of pages: 8
ISBN (Electronic): 9781728108582
DOIs
State: Published - Dec 2019
Event: 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States
Duration: 9 Dec 2019 to 12 Dec 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019

Conference

Conference: 2019 IEEE International Conference on Big Data, Big Data 2019
Country/Territory: United States
City: Los Angeles
Period: 9/12/19 to 12/12/19
