EA-VGG: A new approach for emotional speech classification

Shibani Hamsa Koya, Ismail Shahin, Youssef Iraqi, Ernesto Damiani, Naoufel Werghi

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Recent advances in machine learning techniques have produced efficient solutions for speech emotion recognition (SER). Conventional machine learning techniques draw training and testing data from the same pool, so that both share the same input feature space and data distribution. However, in numerous applications the training and testing data follow different distributions, and collecting new training data that matches the target distribution becomes expensive. This creates a need for high-performance learners trained on existing, similar data. Transfer learning transfers information from one domain to a related one to improve the learning capability of the model. To address this scenario, we introduce a novel paradigm designed exclusively for emotional speech. The newly designed emotional speech VGG transfer learning model uses Affinity loss instead of categorical cross-entropy; the Affinity loss is incorporated to maximize the margin between classes during training. The proposed Emotional Audio VGG (EA-VGG) can be fine-tuned for emotion recognition, speaker identification, and speech recognition from emotional voice. The proposed framework obtained an average emotion recognition accuracy of 89.40%.
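
The abstract does not give the formula for the Affinity loss, so the following is only a minimal sketch of the general idea of a margin-maximizing loss, not the authors' formulation. It assumes each class has a prototype vector, measures similarity with a Gaussian kernel, and penalizes any wrong class whose similarity comes within a margin of the true class. The names `gaussian_similarity`, `margin_loss`, `margin`, and `sigma` are hypothetical and chosen for illustration.

```python
import math


def gaussian_similarity(feature, prototype, sigma=1.0):
    """Similarity in (0, 1]: 1 when feature equals prototype, decaying with distance."""
    sq_dist = sum((f - p) ** 2 for f, p in zip(feature, prototype))
    return math.exp(-sq_dist / sigma)


def margin_loss(feature, prototypes, true_class, margin=0.5, sigma=1.0):
    """Hinge-style penalty (illustrative assumption, not the paper's loss).

    For every wrong class j, add max(0, margin + sim_j - sim_true), so the
    loss is zero only when the true class beats every other class by at
    least `margin` in similarity.
    """
    sims = [gaussian_similarity(feature, p, sigma) for p in prototypes]
    true_sim = sims[true_class]
    return sum(
        max(0.0, margin + s - true_sim)
        for j, s in enumerate(sims)
        if j != true_class
    )


# Two well-separated class prototypes in a 2-D feature space.
prototypes = [[0.0, 0.0], [10.0, 10.0]]
loss_correct = margin_loss([0.0, 0.0], prototypes, true_class=0)
loss_wrong = margin_loss([0.0, 0.0], prototypes, true_class=1)
```

In this toy setup a feature sitting on its own class prototype incurs no penalty, while the same feature labelled with the distant class is penalized heavily, which is the behaviour a margin-based objective is meant to encourage during training.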

Original language: British English
Title of host publication: International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665470957
State: Published - 2022
Event: 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022 - Male, Maldives
Duration: 16 Nov 2022 to 18 Nov 2022

Publication series

Name: International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022

Conference

Conference: 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022
Country/Territory: Maldives
City: Male
Period: 16/11/22 to 18/11/22

Keywords

  • Emotional speech
  • Feature extraction
  • Speaker identification
  • Transfer learning
