Design of an imputation methodology by random selection using regression trees
Abstract:
One of the biggest issues in the information collection stage is missing data. This research focuses on the scenario in which the loss is partial and completely at random, and the data are quantitative. Classic imputation techniques exist, but they have not been able to reproduce the real data accurately. An imputation methodology by random selection based on regression trees is proposed, and imputation with and without the tree is compared theoretically and empirically for different percentages of data loss. Unbiased estimators of the variances and biases are obtained by evaluating their properties, which improves the estimates. A disadvantage of the proposed design is that it does not solve the alteration of the data distribution or of the relationships between the variables.
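The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading: a regression tree is grown on the complete cases, its leaves act as imputation cells, and each missing value is replaced by a randomly selected observed value from the same leaf. The single auxiliary variable `x`, the function names (`build_tree`, `donors_for`, `impute`), the `min_leaf` parameter, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(pairs, min_leaf=5):
    """Grow a one-predictor regression tree on observed (x, y) pairs.

    Hypothetical helper: a leaf is {"donors": [...]}, an internal node is
    {"cut": c, "left": ..., "right": ...}. Splits minimise the total SSE.
    """
    pairs = sorted(pairs)
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between two equal x values
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:  # too few points (or no valid cut): make a leaf
        return {"donors": [y for _, y in pairs]}
    i = best[1]
    cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return {"cut": cut,
            "left": build_tree(pairs[:i], min_leaf),
            "right": build_tree(pairs[i:], min_leaf)}


def donors_for(tree, x):
    """Walk down the tree and return the observed y values in x's leaf."""
    while "cut" in tree:
        tree = tree["left"] if x <= tree["cut"] else tree["right"]
    return tree["donors"]


def impute(xs, ys, rng, use_tree=True, min_leaf=5):
    """Replace None entries of ys with randomly selected observed values.

    With use_tree=False every observed value is a candidate donor, which
    corresponds to random selection without the tree.
    """
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    if use_tree:
        tree = build_tree(observed, min_leaf)
    else:
        tree = {"donors": [y for _, y in observed]}
    return [y if y is not None else rng.choice(donors_for(tree, x))
            for x, y in zip(xs, ys)]


if __name__ == "__main__":
    rng = random.Random(42)
    xs = [i / 10 for i in range(200)]
    truth = [2 * x + rng.gauss(0, 0.5) for x in xs]  # synthetic data
    # Missing completely at random: each value is dropped with probability 0.3.
    ys = [None if rng.random() < 0.3 else y for y in truth]
    for use_tree in (False, True):
        filled = impute(xs, ys, rng, use_tree=use_tree)
        errors = [abs(f - t) for f, t, y in zip(filled, truth, ys) if y is None]
        print("with tree" if use_tree else "without tree", mean(errors))
```

Calling `impute` with `use_tree=False` and `use_tree=True` on the same incomplete data gives a simple empirical comparison of the two variants. Because donors are always real observed values, the sketch never produces values outside the observed range, but it makes no attempt to preserve the joint distribution of the variables.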
Publication year:
2021
Keywords:
Source:

Document type:
Other
Status:
Open access
Knowledge areas:
- Machine learning
- Algorithm
- Computer science
Subject areas:
- Operations of libraries and archives