Optimizing AI for Mobile Malware Detection by Self-Built-Dataset GAN Oversampling and LGBM

Ortal Dayan, Lior Wolf, Fang Wang, Yaniv Harel

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    3 Scopus citations

    Abstract

    The cyber detection industry focuses on analyzing the behavior of threats in order to develop IOCs and triggers. This process keeps detection permanently behind the attackers, because there is an analysis period between the launch of an attack tool and the ability to detect it. To address this challenge, a dedicated Sandbox environment was built and thousands of mobile device samples were tested, resulting in an up-to-date training dataset that is not based on attack analysis. With this dataset, the research focused on optimizing the AI methodology to achieve the best detection rates for a compromised mobile device. A CupolaGAN was implemented to oversample the dataset, and LGBM models trained on the original imbalanced dataset were compared with models trained on the oversampled dataset. Classification scores on the oversampled data increase by a maximum of 0.47 ± 0.37%. The model fine-tuned with Optuna on the balanced data reaches 99.36 ± 0.19% accuracy.
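The balancing and reporting steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: plain resampling with replacement stands in for the GAN generator, no LGBM model is trained, and the helper names (`synthetic_needed`, `oversample`, `mean_std`) and the toy data are hypothetical.

```python
import random
import statistics
from collections import Counter


def synthetic_needed(labels):
    """Return {class: extra samples needed to match the majority class}."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def oversample(rows, labels, seed=0):
    """Balance classes by resampling existing minority rows with replacement.

    Stand-in for a generative model: a GAN would emit new synthetic rows
    here instead of duplicating existing ones.
    """
    rng = random.Random(seed)
    out_rows, out_labels = list(rows), list(labels)
    for cls, extra in synthetic_needed(labels).items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(extra):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def mean_std(scores):
    """Summarise repeated-run scores as (mean, sample standard deviation)."""
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical toy data: 8 benign (0) and 2 compromised (1) device samples.
rows = [[i, i * 2] for i in range(10)]
labels = [0] * 8 + [1] * 2
bal_rows, bal_labels = oversample(rows, labels)
```

In a comparison like the one the abstract reports, the same classifier would be trained once on `rows`/`labels` and once on `bal_rows`/`bal_labels`, with the scores from repeated runs passed to `mean_std` to obtain figures in the "mean ± standard deviation" form.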

    Original language: British English
    Title of host publication: Proceedings of the 2023 IEEE International Conference on Cyber Security and Resilience, CSR 2023
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 60-65
    Number of pages: 6
    ISBN (Electronic): 9798350311709
    State: Published - 2023
    Event: 3rd IEEE International Conference on Cyber Security and Resilience, CSR 2023 - Hybrid, Venice, Italy
    Duration: 31 Jul 2023 → 2 Aug 2023

    Publication series

    Name: Proceedings of the 2023 IEEE International Conference on Cyber Security and Resilience, CSR 2023

    Conference

    Conference: 3rd IEEE International Conference on Cyber Security and Resilience, CSR 2023
    Country/Territory: Italy
    City: Hybrid, Venice
    Period: 31/07/23 → 2/08/23

    Keywords

    • CupolaGAN
    • cybersecurity
    • LightGBM
    • malware detection
    • oversampling
    • Sandbox
