Insurance Risk Prediction Using Machine Learning

Rahul Sahai, Ali Al-Ataby, Sulaf Assi, Manoj Jayabalan, Panagiotis Liatsis, Chong Kim Loy, Abdullah Al-Hamid, Sahar Al-Sudani, Maitham Alamran, Hoshang Kolivand

    Research output: Chapter in Book/Report/Conference proceeding, Chapter, peer-review

    3 Scopus citations

    Abstract

    Underwriting decisions make a significant contribution to the profitability of insurance companies. Machine Learning (ML) techniques in underwriting decision making have saved time and improved operational efficiency. A user-friendly cause-and-effect explanation of a model’s predictions is useful to stakeholders, financial institutions and regulators. This research performed a comparative analysis of tree-based classifiers such as Decision Tree, Random Forest and XGBoost. The study focused on enhancing risk assessment capabilities for life insurance companies using predictive analytics, by classifying insurance risk based on historical data and proposing an appropriate model to assess risk. Its purpose also included incorporating mechanisms that aid user-friendly interpretation of ML models. Of all the models created as part of this research, the XGBoost classifier performed best, with an AUC of 0.86 and an F1-score above 0.56 on the validation set. The Random Forest classifier achieved an AUC of 0.84 and an F1-score of 0.53 on the validation set. The results indicate the importance and advantages of tree-based models. These models (XGBoost, Decision Tree and Random Forest) are among the best alternatives to newer, popular machine learning techniques such as neural networks and deep learning. The research also provides insight into the interpretability of these conventional techniques by way of ‘SHAP’ (Shapley values) and ‘Feature Importance’ (or ‘Variable Importance’). SHAP was used on complex models such as XGBoost and neural networks, whereas Feature Importance was used for supervised learning methods such as Logistic Regression and tree-based models such as Decision Tree and Random Forest. Overall, the study proposes XGBoost as the most accurate model for insurance risk classification and prediction.
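The abstract reports model quality as AUC and F1-score. As a reading aid, the following is a minimal sketch of how these two metrics can be computed for a binary classifier. The toy labels, scores, 0.5 threshold and function names are illustrative assumptions; they are not the study's data or code, and the study's risk classes may not be binary.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1-score: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical validation labels and model scores (not from the study).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.5]

# F1 needs hard predictions; 0.5 is an assumed decision threshold.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"AUC = {roc_auc(y_true, scores):.2f}")
print(f"F1  = {f1(y_true, y_pred):.2f}")
```

AUC depends only on how the scores rank the cases, while F1 depends on the chosen threshold. That difference is one reason a model can show a relatively high AUC alongside a more modest F1-score.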

    Original language: British English
    Title of host publication: Lecture Notes on Data Engineering and Communications Technologies
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 419-433
    Number of pages: 15
    State: Published - 2023

    Publication series

    Name: Lecture Notes on Data Engineering and Communications Technologies
    Volume: 165
    ISSN (Print): 2367-4512
    ISSN (Electronic): 2367-4520

    Keywords

    • Insurance
    • Machine learning
    • Neural networks
    • Random forest
    • Risk prediction
    • Support vector machine
