Discovering similarities in malware behaviors by clustering of API call sequences

Fatima Al Shamsi, Wei Lee Woon, Zeyar Aung

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

New genres of malware are evading detection by using polymorphism, obfuscation and encryption techniques. Hence, new strategies are needed to overcome the limitations of current malware analysis practices. In this paper, we propose an unsupervised learning (clustering) framework to complement the supervised learning (i.e., classifier-based malware detection) approach. We cluster malware instances to discover similarities in their dynamic behaviors and to detect new malware families. To that end, we use Application Programming Interface (API) call sequences to represent the behaviors of malware in a dynamic runtime environment. We investigate three sequence comparison algorithms, namely Optimal Matching (OM), Longest Common Subsequence (LCS), and Longest Common Prefix (LCP), for calculating sequence–sequence distances to be used for hierarchical clustering. Among the three algorithms, LCP is found to be both the most effective in terms of clustering quality and the most efficient in terms of time complexity (linear-time).
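
As a rough illustration of the pipeline the abstract describes, the sketch below computes an LCP-based distance between API call sequences and feeds it to a simple agglomerative (hierarchical) clustering loop. It is not the authors' implementation: the normalisation of the distance, the choice of average linkage, the stopping threshold, the function names and the sample traces are all assumptions made for this example. The prefix comparison itself runs in time linear in the length of the shorter sequence, which is the property the abstract refers to.

```python
from itertools import combinations


def lcp_length(a, b):
    """Length of the longest common prefix of two sequences (linear time)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def lcp_distance(a, b):
    """Assumed normalisation: 0.0 for identical sequences, 1.0 for no shared prefix."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    return 1.0 - 2.0 * lcp_length(a, b) / total


def average_linkage_clusters(seqs, threshold):
    """Merge the closest pair of clusters until the best average-linkage
    distance exceeds `threshold` (a hypothetical cut-off for this sketch)."""
    clusters = [[i] for i in range(len(seqs))]
    # Pairwise distances between individual sequences, keyed by (i, j) with i < j.
    dist = {
        (i, j): lcp_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        if linkage(clusters[p], clusters[q]) > threshold:
            break
        clusters[p] = sorted(clusters[p] + clusters[q])
        del clusters[q]  # q > p, so index p is unaffected
    return clusters


# Made-up API call traces, for illustration only.
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "WriteFile", "RegSetValueExW"],
    ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"],
    ["RegOpenKeyExW", "RegSetValueExW", "CreateProcessW"],
]
clusters = average_linkage_clusters(traces, threshold=0.5)
print(clusters)
```

With these sample traces, sequences that open with the same two calls end up in the same cluster. On real data a library routine for hierarchical clustering would normally replace the hand-written merge loop, which is quadratic or worse in the number of samples.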

Original language: British English
Title of host publication: Neural Information Processing - 25th International Conference, ICONIP 2018, Proceedings
Editors: Seiichi Ozawa, Andrew Chi Sing Leung, Long Cheng
Publisher: Springer Verlag
Pages: 122-133
Number of pages: 12
ISBN (Print): 9783030042110
State: Published - 2018
Event: 25th International Conference on Neural Information Processing, ICONIP 2018 - Siem Reap, Cambodia
Duration: 13 Dec 2018 to 16 Dec 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11304 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Neural Information Processing, ICONIP 2018
Country/Territory: Cambodia
City: Siem Reap
Period: 13/12/18 to 16/12/18

Keywords

  • API calls
  • Clustering
  • Malware
  • Malware patterns
