TY - GEN
T1 - Comparative Analysis of Machine Learning Algorithms for Hepatitis C Virus (HCV) Prediction
AU - Toffaha, Khaled M.
AU - Simsekler, Mecit Can Emre
AU - Omar, Mohammed Atif
N1 - Publisher Copyright: © 2023 Computers and Industrial Engineering. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Hepatitis C is a prevalent disease, with an estimated 3-4 million new cases annually. This poses a significant public health concern, necessitating effective identification and treatment strategies. This study compares the performance of multi-class and binary-class labels using the same dataset, considering various evaluation metrics and tool comparisons. Furthermore, the study seeks to utilize machine learning (ML) algorithms to identify key features in predicting the Hepatitis C Virus (HCV) using an Egyptian patient dataset. The results reveal that the Support Vector Machine (SVM) achieved the highest accuracy rate of 28.45% for the multi-class label. At the same time, the random forest attained an accuracy of 75.05% for the binary class label. Notably, the binary class performance outperformed the multi-class label. Additionally, multi-feature selection methods improved convergence speed and yielded better accuracy and precision. Normalization and data scaling techniques also played a vital role in improving the results. Moreover, the study employed Bayesian Networks (BNs) and the Shapley Additive Explanations (SHAP) method to gain insights into the predictions made by the ML model. These techniques provided valuable explanations for the model's decisions, enhancing interpretability and aiding in understanding the factors driving the predictions. Overall, this study contributes to HCV prediction by comparing performance between different label types and exploring feature selection methods. The findings underscore the importance of accurate prediction and highlight the potential of advanced techniques such as BNs and SHAP for improved interpretability in ML models.
KW - Bayesian Network
KW - Hepatitis C
KW - Machine Learning
KW - Machine Learning Prediction
UR - http://www.scopus.com/inward/record.url?scp=85184282609&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85184282609
T3 - Proceedings of International Conference on Computers and Industrial Engineering, CIE
SP - 1348
EP - 1358
BT - 50th International Conference on Computers and Industrial Engineering, CIE 2023
A2 - Dessouky, Yasser
A2 - Shamayleh, Abdulrahim
T2 - 50th International Conference on Computers and Industrial Engineering: Sustainable Digital Transformation, CIE 2023
Y2 - 30 October 2023 through 2 November 2023
ER -