TY - JOUR
T1 - Analysis and Prediction of User Sentiment on COVID-19 Pandemic Using Tweets
AU - Yeasmin, Nilufa
AU - Mahbub, Nosin Ibna
AU - Baowaly, Mrinal Kanti
AU - Singh, Bikash Chandra
AU - Alom, Zulfikar
AU - Aung, Zeyar
AU - Azim, Mohammad Abdul
N1 - Funding Information:
This research is partially funded by Khalifa University, Abu Dhabi, United Arab Emirates.
Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2022/6
Y1 - 2022/6
N2 - The novel coronavirus disease (COVID-19) has dramatically affected people’s daily lives worldwide. Because access to vaccines is still insufficient and there is no straightforward, reliable treatment for COVID-19, countries have taken precautions (such as physical separation, masking, and lockdown) to combat this extremely infectious disease. As a result, people spend much time on online social networking platforms (e.g., Facebook, Reddit, LinkedIn, and Twitter) and express their feelings and thoughts regarding COVID-19. Twitter is a popular social networking platform on which anyone can post tweets. This research used Twitter datasets to explore user sentiment about COVID-19. We used a dataset of COVID-19 Twitter posts from nine states in the United States covering fifteen days (1 April 2020 to 15 April 2020) to analyze user sentiment. We focus on machine learning (ML) and deep learning (DL) approaches to classify user sentiment regarding COVID-19. First, we labeled the dataset into three groups based on the sentiment values, namely positive, negative, and neutral, and then trained several popular ML algorithms and DL models to predict the user concern label on COVID-19. Additionally, we compared traditional bag-of-words and term frequency-inverse document frequency (TF-IDF) representations for converting text to numeric vectors for the ML techniques. Furthermore, we contrasted the encoding methodology with various word embedding schemes, such as word to vector (Word2Vec) and global vectors for word representation (GloVe), in three dimensions (100, 200, and 300), for converting text to numeric vectors for the DL approaches. Finally, we compared COVID-19 infection cases with COVID-19-related tweets during the pandemic.
KW - COVID-19
KW - machine learning
KW - natural language processing
KW - neural network
KW - sentiment analysis
KW - tweets
UR - http://www.scopus.com/inward/record.url?scp=85132174037&partnerID=8YFLogxK
U2 - 10.3390/bdcc6020065
DO - 10.3390/bdcc6020065
M3 - Article
AN - SCOPUS:85132174037
SN - 2504-2289
VL - 6
JO - Big Data and Cognitive Computing
JF - Big Data and Cognitive Computing
IS - 2
M1 - 65
ER -