TY - GEN
T1 - Studying the Effect of Face Masks in Identifying Speakers using LSTM
AU - Bader, Mohamed
AU - Shahin, Ismail
AU - Ahmed, Abdelfatah
AU - Werghi, Naoufel
N1 - Funding Information:
The authors of this study would like to express their gratitude to the University of Sharjah in the United Arab Emirates for providing financial support to this study through the competitive research project entitled “Investigation and Analysis of Emirati-Accented Corpus in Neutral and Abnormal Talking Environments for Engineering Applications using Shallow and Deep Classifiers”, 20020403159.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - During the COVID-19 pandemic, it has been standard procedure for people all around the world to use Respiratory Protection Masks (RPMs) that cover both the nose and the mouth. The consequences of wearing RPMs, particularly those pertaining to the perception and production of spoken communication, are rapidly becoming more prominent. The use of face masks also attenuates voice signals, and this alteration affects speech-processing technologies such as Automatic Speaker Verification (ASV) and speech-to-text conversion. An intervention by a deep learning-based algorithm is considered vital to remedy the issue of inappropriate exploitation of speaker-based technology. Therefore, in the proposed framework, a speaker identification system has been implemented to examine the effect of masks. First, the speech signals are captured, pre-processed, and augmented using a variety of data augmentation techniques. Afterward, different Mel-Frequency Cepstral Coefficient (MFCC) features are extracted and fed into a Long Short-Term Memory (LSTM) network to identify speakers. The system's overall performance has been assessed using accuracy, precision, recall, and F1-score, yielding 93%, 93.3%, 92.2%, and 92.8%, respectively. The results are still preliminary and are subject to further enhancement in the future through data expansion and the use of multiple optimization techniques.
AB - During the COVID-19 pandemic, it has been standard procedure for people all around the world to use Respiratory Protection Masks (RPMs) that cover both the nose and the mouth. The consequences of wearing RPMs, particularly those pertaining to the perception and production of spoken communication, are rapidly becoming more prominent. The use of face masks also attenuates voice signals, and this alteration affects speech-processing technologies such as Automatic Speaker Verification (ASV) and speech-to-text conversion. An intervention by a deep learning-based algorithm is considered vital to remedy the issue of inappropriate exploitation of speaker-based technology. Therefore, in the proposed framework, a speaker identification system has been implemented to examine the effect of masks. First, the speech signals are captured, pre-processed, and augmented using a variety of data augmentation techniques. Afterward, different Mel-Frequency Cepstral Coefficient (MFCC) features are extracted and fed into a Long Short-Term Memory (LSTM) network to identify speakers. The system's overall performance has been assessed using accuracy, precision, recall, and F1-score, yielding 93%, 93.3%, 92.2%, and 92.8%, respectively. The results are still preliminary and are subject to further enhancement in the future through data expansion and the use of multiple optimization techniques.
KW - COVID-19
KW - Deep Learning
KW - Face Masks
KW - Speaker Identification
UR - http://www.scopus.com/inward/record.url?scp=85146368489&partnerID=8YFLogxK
U2 - 10.1109/ICECTA57148.2022.9990479
DO - 10.1109/ICECTA57148.2022.9990479
M3 - Conference contribution
AN - SCOPUS:85146368489
T3 - 2022 International Conference on Electrical and Computing Technologies and Applications, ICECTA 2022
SP - 99
EP - 102
BT - 2022 International Conference on Electrical and Computing Technologies and Applications, ICECTA 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 International Conference on Electrical and Computing Technologies and Applications, ICECTA 2022
Y2 - 23 November 2022 through 25 November 2022
ER -