Missing data imputation in breast cancer prognosis


Abstract:

Missing data are a common problem in real datasets, and imputation techniques are normally used to alleviate it. In this paper we analyze the performance of two data imputation methods in a task where the aim is to predict the probability of breast cancer relapse. Mean imputation and hot-deck methods were used to replace missing values in a dataset containing 3679 patient records. Artificial neural network models were trained with the standard dataset (containing no missing data but a restricted number of cases) and also with the data reconstructed using each of the two imputation methods. The results were analyzed in terms of predictive accuracy and calibration.
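As a rough illustration of the two imputation strategies named in the abstract, the sketch below fills missing values in a single numeric column. It is not the authors' implementation: the abstract does not say which hot-deck variant was used, so a simple random hot-deck (drawing a donor value from the observed cases) is assumed here. The function names and the `tumour_size` values are hypothetical.

```python
import random
from statistics import mean


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def hot_deck_impute(values, rng=None):
    """Replace None entries with a value drawn at random from observed donors.

    Assumption: random hot-deck within one column; the paper may use a
    different donor-selection rule.
    """
    rng = rng or random.Random(0)
    donors = [v for v in values if v is not None]
    return [rng.choice(donors) if v is None else v for v in values]


# Hypothetical tumour sizes in mm; None marks a missing value.
tumour_size = [12.0, None, 30.0, 18.0, None]

mean_filled = mean_impute(tumour_size)
hot_deck_filled = hot_deck_impute(tumour_size, random.Random(42))
```

Mean imputation gives every missing entry the same value, which shrinks the variance of the column, whereas hot-deck reuses values that actually occur in the data and so tends to preserve the observed distribution.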

Publication year:

2006

Keywords:

  • prognosis
  • Missing data imputation
  • Breast Cancer
  • artificial neural networks

Source:

Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Statistics
  • Data analysis

Subject areas:

  • Diseases
  • Library and archive operations