A comprehensive survey on applications of transformers for deep learning tasks

Saidul Islam, Hanae Elmekki, Ahmed Elsebai, Jamal Bentahar, Nagat Drawel, Gaith Rjoub, Witold Pedrycz

    Research output: Contribution to journalReview articlepeer-review

    14 Scopus citations

    Abstract

    Transformers are Deep Neural Networks (DNN) that utilize a self-attention mechanism to capture contextual relationships within sequential data. Unlike traditional neural networks and variants of Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM), Transformer models excel at managing long dependencies among input sequence elements and facilitate parallel processing. Consequently, Transformer-based models have garnered significant attention from researchers in the field of artificial intelligence. This is due to their tremendous potential and impressive accomplishments, which extend beyond Natural Language Processing (NLP) tasks to encompass various domains, including Computer Vision (CV), audio and speech processing, healthcare, and the Internet of Things (IoT). Although several survey papers have been published, spotlighting the Transformer's contributions in specific fields, architectural disparities, or performance assessments, there remains a notable absence of a comprehensive survey paper that encompasses its major applications across diverse domains. Therefore, this paper addresses this gap by conducting an extensive survey of proposed Transformer models spanning from 2017 to 2022. Our survey encompasses the identification of the top five application domains for Transformer-based models, namely: NLP, CV, multi-modality, audio and speech processing, and signal processing. We analyze the influence of highly impactful Transformer-based models within these domains and subsequently categorize them according to their respective tasks, employing a novel taxonomy. Our primary objective is to illuminate the existing potential and future prospects of Transformers for researchers who are passionate about this area, thereby contributing to a more comprehensive understanding of this groundbreaking technology.

    Original languageBritish English
    Article number122666
    JournalExpert Systems with Applications
    Volume241
    DOIs
    StatePublished - 1 May 2024

    Keywords

    • Computer vision (CV)
    • Deep learning
    • Multi-modality
    • Natural language processing (NLP)
    • Self-attention
    • Transformer

    Fingerprint

    Dive into the research topics of 'A comprehensive survey on applications of transformers for deep learning tasks'. Together they form a unique fingerprint.

    Cite this