TY - JOUR
T1 - Semi-strong efficient market of Bitcoin and Twitter
T2 - An analysis of semantic vector spaces of extracted keywords and light gradient boosting machine models
AU - Wang, Fang
AU - Gacesa, Marko
N1 - Publisher Copyright: © 2023 Elsevier Inc.
PY - 2023/7
Y1 - 2023/7
AB - This study extends the examination of the Efficient-Market Hypothesis (EMH) in the Bitcoin market during a five-year fluctuation period, from September 1, 2017 to September 1, 2022, by analyzing 28,739,514 qualified tweets containing the targeted topic “Bitcoin”. Unlike previous studies, we extracted fundamental keywords as an informative proxy for studying the EMH in the Bitcoin market, rather than focusing on sentiment analysis, information volume, or price data. We tested market efficiency at hourly, 4-hourly, and daily intervals to understand the speed and accuracy of market reactions to information within different thresholds. A sequence of machine learning methods and textual analyses was used, including measurements of distances of semantic vector spaces of information, a keyword extraction and encoding model, and Light Gradient Boosting Machine (LGBM) classifiers. Our results suggest that 78.06% (83.08%), 84.63% (87.77%), and 94.03% (94.60%) of hourly, 4-hourly, and daily bullish (bearish) market movements, respectively, can be attributed to public information within organic tweets.
KW - Bitcoin
KW - Efficient-market hypothesis
KW - GloVe semantic vector spaces
KW - LightGBM
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85160851071&partnerID=8YFLogxK
U2 - 10.1016/j.irfa.2023.102692
DO - 10.1016/j.irfa.2023.102692
M3 - Article
AN - SCOPUS:85160851071
SN - 1057-5219
VL - 88
JO - International Review of Financial Analysis
JF - International Review of Financial Analysis
M1 - 102692
ER -