Abstract
A substantial portion of the water supply and sanitation (WatSan) infrastructure in the rural areas of developing countries is currently not operating. This failure is due to the inappropriate implementation of WatSan technologies and the lack of decision-making resources. This study explores the application of several machine learning classification algorithms to effectively predict the optimal WatSan system. The proposed classification methods are Logistic Regression, Random Forest, Support Vector Machine, CatBoost, and Neural Network. The practicality of these methods was tested using a dataset comprising 774 water technology options. Several experiments were conducted to maximise classification performance for the capacity requirement level (CRL), measured by accuracy and F1 score. Our findings suggest that CatBoost, combined with the synthetic minority oversampling technique (SMOTE), outperforms the other algorithms in classifying WatSan technology options.
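
As an illustration of two ideas named in the abstract, the sketch below shows how SMOTE-style oversampling creates synthetic minority-class samples by interpolation, and why F1 score is a useful companion to accuracy when classes are imbalanced. It is a minimal standard-library sketch, not the authors' pipeline: the function names, the toy feature vectors, the `"low"`/`"high"` labels, and the choice of macro-averaged F1 are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an
    assumption here; the abstract does not state the averaging used)."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical two-feature minority-class samples (not data from the study).
minority = [(1.0, 1.0), (1.5, 1.2), (0.8, 1.4), (1.2, 0.9)]
new_points = smote(minority, n_new=6)

# A classifier that always predicts the majority class looks acceptable on
# accuracy but poor on macro F1, which is why both metrics are reported.
y_true = ["low", "low", "low", "high"]
y_pred = ["low", "low", "low", "low"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Because each synthetic point lies on a segment between two real minority samples, oversampling of this kind fills in the minority region rather than duplicating existing rows. In practice a maintained implementation, such as the one in the imbalanced-learn package, would normally be used instead of a hand-written version.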
Original language | British English |
---|---|
Article number | 2829 |
Journal | Water (Switzerland) |
Volume | 15 |
Issue number | 15 |
State | Published - Aug 2023 |
Keywords
- classification
- decision support system
- Logistic Regression
- machine learning
- Random Forest
- Support Vector Machine