TY - GEN
T1 - Hybrid CNN-LSTM Speaker Identification Framework for Evaluating the Impact of Face Masks
AU - Bader, Mohamed
AU - Shahin, Ismail
AU - Ahmed, Abdelfatah
AU - Werghi, Naoufel
N1 - Funding Information:
The authors thank the University of Sharjah in the United Arab Emirates for providing financial support for this study through the competitive research project entitled: Investigation and Analysis of Emirati-Accented Corpus in Neutral and Abnormal Talking Environments for Engineering Applications using Shallow and Deep Classifiers, 20020403159.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Following the declaration of COVID-19 as a worldwide pandemic that has disrupted a multitude of lives, wearing face masks has become crucial to curbing the spread of the virus. The masks available on the market vary in type and material and tend to affect the speaker's vocal characteristics, which may hamper effective communication. The proposed framework employs a speaker identification model. A speech corpus was first captured. Spectrograms were then obtained, with pre-processing applied in two stages: the first to the audio samples and the second to the spectrograms. The generated spectrograms were then passed into a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model to perform the classification. The proposed framework has shown that it can identify speakers while they are wearing face masks. Evaluated on the collected dataset, the system attained 92.7%, 92.62%, 87.71%, and 88.26% in terms of accuracy, precision, recall, and F1-score, respectively. These findings are still preliminary and will be refined in future work through data expansion and the use of several optimization approaches.
KW - CNN
KW - COVID-19
KW - Face Masks
KW - LSTM
KW - Speaker Identification
UR - http://www.scopus.com/inward/record.url?scp=85146366898&partnerID=8YFLogxK
U2 - 10.1109/ICECTA57148.2022.9990138
DO - 10.1109/ICECTA57148.2022.9990138
M3 - Conference contribution
AN - SCOPUS:85146366898
T3 - 2022 International Conference on Electrical and Computing Technologies and Applications, ICECTA 2022
SP - 118
EP - 121
BT - 2022 International Conference on Electrical and Computing Technologies and Applications, ICECTA 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 International Conference on Electrical and Computing Technologies and Applications, ICECTA 2022
Y2 - 23 November 2022 through 25 November 2022
ER -