A Comparison of Machine Learning Algorithms to Predict Cervical Cancer on Imbalanced Data


Abstract:

Cervical cancer is a leading cause of death in women. This study compares machine learning techniques for predicting cervical cancer and identifies the best-performing method. The data come from the University Hospital of Caracas, Venezuela; the predictor variables were selected according to the literature. Seven algorithms were applied: decision tree (DT), random forest (RF), logistic regression (LR), XGBoost (XG), Naive Bayes (NB), multilayer perceptron (MLP) and K-nearest neighbors (KNN). Three techniques for imbalanced data were also applied: SMOTETomek, SMOTE and random oversampling (ROS), with Hinselmann, Schiller, Cytology and Biopsy as target variables. Results were evaluated with accuracy, precision, recall, F-score and AUC. Random forest achieved the highest accuracy, precision and F-score, with 94.57%, 72.46% and 60.70% respectively. Logistic regression and Naive Bayes had the highest recall and AUC, with 68.37% and 79.11% respectively.
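As a rough illustration of two ideas named in the abstract, the sketch below shows random oversampling (ROS) and the accuracy, precision, recall and F-score metrics in plain Python. It is not the authors' code: the function names `random_oversample` and `binary_metrics` and the toy labels are assumptions made for this example, and the study's actual pipeline, data and library choices are not described in this record.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Toy, made-up labels: 8 negatives and 2 positives (not the study's data).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))

# A classifier that always predicts the majority class looks accurate
# but finds no positive cases, which is why recall and F-score matter.
print(binary_metrics(y, [0] * 10))
```

On the toy labels, the always-negative prediction reaches high accuracy while recall is zero, which suggests why the abstract reports precision, recall, F-score and AUC alongside accuracy. In practice, resampling would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data.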

Publication year:

2023

Keywords:

  • Machine learning
  • Prediction
  • Imbalanced data techniques
  • Cervical Cancer

Source:

Scopus
Google

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Machine learning

Dewey subject areas:

  • Computer science
  • Technology (Applied sciences)
  • Medicine and health
Processed with AI

Sustainable Development Goals:

  • SDG 3: Good health and well-being
  • SDG 17: Partnerships for the goals
  • SDG 9: Industry, innovation and infrastructure
Processed with AI