A Late Multi-modal Fusion Model for Detecting Hybrid Spam E-mail

    Research output: Contribution to journal, Article, peer-review

    5 Scopus citations

    Abstract

    In recent years, spammers have tried to evade spam filtering systems by introducing hybrid spam e-mail that combines image and text parts, which poses a more destructive and complicated threat to cyber security than e-mail containing only text or only images. Traditionally, Optical Character Recognition (OCR) technology is used to eliminate the image parts of spam by transforming images into text. Although OCR scanning is a very successful technique for processing text-and-image hybrid spam, it is not an effective solution for large volumes of e-mail because of the Central Processing Unit (CPU) power it requires and the execution time it takes to scan e-mail files. To address this problem, this paper proposes a late multi-modal fusion model for a text-and-image hybrid spam e-mail filtering system and compares it with the classical early fusion detection model based on the OCR method. A Convolutional Neural Network (CNN) and Continuous Bag of Words were used to extract features from the image and text parts of hybrid spam, respectively, and the generated features were fed to a sigmoid layer and machine-learning-based classifiers to determine whether an e-mail is ham or spam. The two resulting classification probability values were fed to a late decision model, and the final classification decisions were compared with those of text-only classifiers based on the OCR technique in terms of prediction accuracy and computational efficiency. The experimental results show that the proposed late fusion model is clearly superior to the benchmark in terms of execution time, while its other performance metrics are adequate. These findings reveal the advantages of using a CNN rather than OCR to detect hybrid spam e-mails.
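The late decision step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors used, so the weighted average below is an assumption, and the names `FusionConfig`, `late_fusion`, `image_weight`, and `threshold` are hypothetical. The two inputs stand in for the spam probabilities produced by the image branch and the text branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionConfig:
    # Hypothetical parameters: the abstract does not specify the fusion rule.
    image_weight: float = 0.5  # share of the fused score taken from the image branch
    threshold: float = 0.5     # fused score at or above this is labeled spam


def late_fusion(p_image: float, p_text: float,
                cfg: FusionConfig = FusionConfig()) -> tuple[float, str]:
    """Combine two per-modality spam probabilities into one decision.

    Assumed rule: a weighted average of the two probabilities,
    followed by a fixed threshold.
    """
    for p in (p_image, p_text):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    fused = cfg.image_weight * p_image + (1.0 - cfg.image_weight) * p_text
    label = "spam" if fused >= cfg.threshold else "ham"
    return fused, label


if __name__ == "__main__":
    # Image branch is confident, text branch is not: the fused score decides.
    print(late_fusion(0.9, 0.2))
    # Give the text branch more influence by lowering the image weight.
    print(late_fusion(0.9, 0.2, FusionConfig(image_weight=0.2)))
```

Because each branch is scored independently and only the two probabilities are combined, no OCR pass over the image is needed at decision time, which is the source of the execution-time advantage the abstract reports.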

    Original language: British English
    Pages (from-to): 76-81
    Number of pages: 6
    Journal: International Journal of Computer Theory and Engineering
    Volume: 15
    Issue number: 2
    State: Published - May 2023

    Keywords

    • Convolutional neural network
    • cyber security
    • hybrid spam e-mail
    • late fusion
    • spam filtering
