Feature Fusion for Human Activity Recognition using Parameter-Optimized Multi-Stage Graph Convolutional Network and Transformer Models

  • Mohammad Belal
  • Taimur Hassan
  • Abdelfatah Ahmed
  • Ahmad Aljarah
  • Nael Alsheikh
  • Irfan Hussain

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Human activity recognition is a crucial area of research that involves understanding human movements using computer and machine vision technology. Deep learning has emerged as a powerful tool for this task, with models such as Convolutional Neural Networks (CNNs) and Transformers being employed to capture different aspects of human motion. A key contribution of this work is demonstrating the effectiveness of feature fusion in improving human activity recognition accuracy, which has important implications for the development of more accurate and robust recognition systems. This approach addresses a limitation of the field, where the performance of existing models is often constrained by their inability to capture both spatial and temporal features effectively. This work presents an approach for human activity recognition using sensory data extracted from four distinct datasets: HuGaDB, PKU-MMD, LARa, and TUG. Two models, the Parameter-Optimized Multi-Stage Graph Convolutional Network (PO-MS-GCN) and a Transformer, were trained and evaluated on each dataset to compute accuracy and F1-score. Subsequently, the features from the last layer of each model were combined and fed into a classifier. The findings show that the PO-MS-GCN outperforms state-of-the-art models in human activity recognition: on HuGaDB it achieved an accuracy of 92.7% and an F1-score of 95.2%, and on TUG an accuracy of 93.2% and an F1-score of 98.3%, while LARa and PKU-MMD yielded lower accuracies of 64.31% and 69%, respectively, with corresponding F1-scores of 40.63% and 48.16%. Moreover, feature fusion exceeded the standalone PO-MS-GCN's results on the PKU-MMD, LARa, and TUG datasets.
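The fusion step described in the abstract — combining last-layer features from the two models and feeding them to a classifier — can be sketched as simple concatenation along the feature axis. This is a minimal illustrative sketch: the array shapes, feature dimensions, and the random placeholder weights are assumptions for demonstration, not the paper's actual architecture or learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's real dimensions).
n_samples = 8
gcn_dim, trf_dim, n_classes = 64, 128, 5

# Stand-ins for last-layer features of the PO-MS-GCN and the Transformer.
gcn_feats = rng.standard_normal((n_samples, gcn_dim))
trf_feats = rng.standard_normal((n_samples, trf_dim))

# Feature fusion: concatenate the two representations per sample.
fused = np.concatenate([gcn_feats, trf_feats], axis=1)  # shape (8, 192)

# A linear classifier over the fused features (placeholder weights;
# in practice these would be learned on the training set).
W = rng.standard_normal((gcn_dim + trf_dim, n_classes))
logits = fused @ W
pred = logits.argmax(axis=1)  # one activity label per sample
```

The key property is that fusion preserves both views: the classifier sees the GCN's spatial features and the Transformer's temporal features side by side, rather than relying on either model alone.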

Original language: British English
Title of host publication: AVSS 2024 - 20th IEEE International Conference on Advanced Video and Signal-Based Surveillance
Publisher: Institute of Electrical and Electronics Engineers Inc.
Edition: 2024
ISBN (Electronic): 9798350374285
DOIs
State: Published - 2024
Event: 20th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2024 - Niagara Falls, Canada
Duration: 15 Jul 2024 – 16 Jul 2024

Conference

Conference: 20th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2024
Country/Territory: Canada
City: Niagara Falls
Period: 15/07/24 – 16/07/24

