Missing data imputation in breast cancer prognosis
Abstract:
Missing data are a common problem in real datasets, and imputation techniques are normally used to alleviate it. In this paper we analyze the performance of two data imputation methods in a task where the aim is to predict the probability of breast cancer relapse. Mean imputation and hot-deck methods were used to replace the missing values found in a dataset containing 3679 patient records. Artificial neural network models were trained with the standard dataset (containing no missing data but a restricted number of cases) and also with the data reconstructed using the two imputation methods mentioned above. The results were analyzed in terms of predictive accuracy and also in terms of calibration.
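The two imputation methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which hot-deck variant was used, so a random-donor hot-deck is assumed here, and the function names, the variable `tumor_size_mm`, and the sample values are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def mean_impute(column):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def hot_deck_impute(column, rng):
    """Replace each missing value with the value of a randomly drawn donor.

    Assumption: a simple random hot-deck, where any record with an observed
    value may act as donor. Other variants pick donors among similar records.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    return [rng.choice(observed) if v is None else v for v in column]


# Hypothetical toy column with two missing entries (not data from the paper).
tumor_size_mm = [12.0, None, 30.0, 18.0, None]

mean_filled = mean_impute(tumor_size_mm)
hot_deck_filled = hot_deck_impute(tumor_size_mm, random.Random(42))
```

Mean imputation fills every gap with the same value, which shrinks the variance of the variable, whereas hot-deck reuses values actually observed in other records and so keeps the imputed values within the observed distribution.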
Publication year:
2006
Keywords:
- prognosis
- Missing data imputation
- Breast Cancer
- artificial neural networks
Source:
Document type:
Conference Object
Status:
Restricted access
Knowledge areas:
- Statistics
- Data analysis
Subject areas:
- Diseases
- Operation of libraries and archives