Enhanced Gesture Recognition Through Graph-Based Multimodal Fusion

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This study introduces an advanced framework for recognizing hand gestures from a first-person view, leveraging the integration of multimodal data including optical flow, pose, depth, and RGB video recordings, and adeptly navigating the challenges and opportunities that such integration presents. At its core, the framework employs two pivotal components: a cross-attention based adaptive graph convolutional network and relational graph interactions for modality fusion. The former extracts features from skeleton-based gesture data, ensuring a nuanced capture of hand movements by emphasizing the interconnections within the hand's skeletal structure. The latter models each output modality feature as a node in a fully connected relational graph, fusing heterogeneous data types through dynamic interactions between modalities. This approach leverages each data type's strengths and mitigates its weaknesses, significantly enhancing the system's classification accuracy and robustness. Tested on a public benchmark dataset, the framework achieved an accuracy of 98.48%, demonstrating its efficacy. Moreover, it proves resilient, maintaining strong performance (93.48% accuracy) even when only one modality is available, highlighting its potential for real-world applications. This advancement sets a new benchmark in hand gesture recognition, promising future developments in multimodal data fusion.
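The paper's actual fusion module is not reproduced here, but the core idea of the relational graph interaction, treating each modality's feature vector as a node in a fully connected graph and letting nodes exchange information via attention-weighted message passing, can be sketched minimally. All function names, the dot-product attention form, and the mean pooling below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def relational_graph_fusion(modality_feats):
    """Illustrative sketch: fuse per-modality feature vectors by
    treating each as a node in a fully connected relational graph.
    (Attention form and pooling are assumptions, not the paper's method.)"""
    X = np.stack(modality_feats)             # (M, D): one row per modality node
    scores = X @ X.T / np.sqrt(X.shape[1])   # pairwise node affinities
    # Row-wise softmax turns affinities into attention weights
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    updated = weights @ X                    # one round of message passing
    return updated.mean(axis=0)              # pooled fused representation

# Hypothetical per-modality embeddings (RGB, depth, pose, optical flow)
rng = np.random.default_rng(0)
rgb, depth, pose, flow = (rng.random(8) for _ in range(4))
fused = relational_graph_fusion([rgb, depth, pose, flow])
```

Because every node attends to every other node, a missing modality simply removes one row from the graph rather than breaking the pipeline, which is consistent with the single-modality robustness the abstract reports.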

Original language: British English
Title of host publication: 2024 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350366556
DOIs
State: Published - 2024
Event: 14th IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024 - Hybrid, Bali, Indonesia
Duration: 19 Aug 2024 - 22 Aug 2024

Publication series

Name: 2024 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024

Conference

Conference: 14th IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024
Country/Territory: Indonesia
City: Hybrid, Bali
Period: 19/08/24 - 22/08/24

Keywords

  • Action Recognition
  • Cross-Attention Fusion
  • Graph-Based Multimodal Fusion
  • Relational Graph Interactions
  • Skeleton-based Action Recognition

