TY - JOUR
T1 - Enhancing Medicare Fraud Detection with a CNN-Transformer-XGBoost Framework and Explainable AI
AU - Sakil, Mohammad Balayet Hossain
AU - Hasan, Md Amit
AU - Mozumder, Md Shahin Alam
AU - Hasan, Md Rokibul
AU - Opee, Shafiul Ajam
AU - Mridha, M. F.
AU - Aung, Zeyar
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - Healthcare fraud is a critical challenge, contributing significantly to rising healthcare costs and financial losses. This article proposes a hybrid architecture for healthcare fraud detection, combining deep learning-based feature representation with gradient boosting classification and explainable AI techniques. The framework integrates convolutional neural networks (CNNs), transformers, and XGBoost to capture intricate patterns in claims data while maintaining interpretability through Shapley Additive Explanations (SHAP). The proposed model was evaluated on two datasets: the Medicare Provider Fraud dataset and the Healthcare Providers dataset. On the Medicare dataset, the framework achieved an F1-score of 0.95 on the training set and 0.92 on the test set, with AUC-ROC values of 0.98 and 0.97, respectively, outperforming state-of-the-art models such as LightGBM and CatBoost. On the Healthcare Providers dataset, the framework attained a test F1-score of 0.92 and an AUC-ROC of 0.96, consistently surpassing traditional models such as Support Vector Machines and Random Forest. Key contributions include the integration of domain-specific features, such as provider-patient interaction graphs and temporal patterns, and the use of explainability techniques to enhance trustworthiness. Furthermore, the framework demonstrated computational efficiency, with a training time of 150 seconds on the primary dataset, making it suitable for real-world deployment.
KW - convolutional neural networks
KW - Healthcare fraud detection
KW - hybrid deep learning
KW - Medicare fraud
KW - SHAP explainability
KW - Transformers
KW - XGBoost
UR - https://www.scopus.com/pages/publications/105003440419
U2 - 10.1109/ACCESS.2025.3562577
DO - 10.1109/ACCESS.2025.3562577
M3 - Article
AN - SCOPUS:105003440419
SN - 2169-3536
JO - IEEE Access
JF - IEEE Access
ER -