Methodology for Detecting Suspicious Claims in Health Insurance Using Supervised Machine Learning
Abstract:
Health insurance fraud (HIF) places a substantial economic burden on health systems worldwide. Supervised machine learning (SML) is a promising way to detect it, but most existing approaches are ad hoc and lack a systematic methodological framework that ensures replicability, adaptability, and effectiveness, especially under severe class imbalance. We developed PDHIF (Phases for Detecting Fraud in Health Insurance), a six-phase systematic methodology that takes a holistic approach, integrating fraud theory, actors, manifestations, and factors with the complete SML lifecycle. We applied the methodology in a case study using a dataset of 8.5 million claims from a public health insurance system in Peru. We trained and evaluated three SML models, Random Forest (RF), XGBoost, and a multilayer perceptron, in two experimental scenarios: one using the original, highly imbalanced dataset and another using a training set balanced with the K-means SMOTE technique. The results revealed a stark contrast. In the imbalanced scenario, the models were ineffective at detecting fraud (F1 score < 0.521) despite high accuracy (> 98%). In the balanced scenario, performance improved dramatically: the best-performing model, RF, achieved an F1 score of 0.994, a sensitivity of 0.994, and an AUC of 0.994 on the test set, demonstrating a robust ability to distinguish suspicious claims.
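The contrast between high accuracy and a low F1 score on imbalanced data can be illustrated with a little arithmetic. The sketch below is not taken from the study: the function name `classification_metrics` and the confusion-matrix counts are hypothetical, chosen only to show how a classifier that misses most suspicious claims can still report accuracy above 98%.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall (sensitivity), and F1
    from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical example (not the study's data): 10,000 claims, of which
# 100 (1%) are suspicious. The model flags 40 claims and 30 are correct.
m = classification_metrics(tp=30, fp=10, fn=70, tn=9890)
print(f"accuracy={m['accuracy']:.3f} recall={m['recall']:.3f} f1={m['f1']:.3f}")
# prints: accuracy=0.992 recall=0.300 f1=0.429
```

Because the 9,890 correctly classified legitimate claims dominate the total, accuracy stays near 99% even though 70 of the 100 suspicious claims go undetected. F1 and sensitivity depend only on the minority class, which is why they are more informative in this setting.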
Publication year:
2025
Keywords:
- Class imbalance
- Fraud detection
- Health insurance fraud
- Machine learning
- Peru
- Systematic methodology
Source:
Scopus
Document type:
Article
Status:
Restricted access
Knowledge areas:
- Machine learning
- Insurance
- Health care
Dewey subject areas:
- Special computer methods
- Insurance
- Medicine and health
Sustainable Development Goals:
- SDG 16: Peace, justice and strong institutions
- SDG 17: Partnerships for the goals
- SDG 8: Decent work and economic growth