On the use of conventional and statistical-learning techniques for the analysis of PISA results in Spain

Abstract:

A simple and general feature extraction procedure is presented that provides robust nonparametric estimates of the statistical relevance of data features. It computes confidence intervals for the model weights in the case of linear models, and for the change in the error rate when each feature is removed in the case of nonlinear models. The performance of the method is examined specifically for the prediction of the 2009 PISA scores of Spanish students. We compare the ability of logistic regression, Fisher linear discriminant analysis, and Support Vector Machines (SVM, with both linear and nonlinear kernels) to classify top performers in the mathematics exam. All the methods yield similar accuracy, with linear and nonlinear SVM providing improved feature reduction capabilities at the expense of computational complexity. The results show relevant relationships between the success rate and regional variables, computer availability, gender, immigration status, learning strategies, and some others. The proposed feature selection procedure for machine learning classification can be readily used in other fields, and it can be improved with further theoretical and probabilistic development.
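
The linear-model case described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a percentile bootstrap (suggested by the "bootstrap resampling" keyword), a two-feature Fisher discriminant, and synthetic data in which only the first feature is informative. The function names `make_data`, `fisher_weights`, and `bootstrap_ci` are hypothetical and chosen only for this illustration.

```python
import random


def make_data(n=200, seed=1):
    """Synthetic data (an assumption): feature 0 is informative, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append((rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)))
        y.append(label)
    return X, y


def fisher_weights(X, y):
    """Fisher discriminant direction w = Sw^{-1} (m1 - m0) for two features."""
    groups = {c: [x for x, t in zip(X, y) if t == c] for c in (0, 1)}
    means = {c: [sum(col) / len(rows) for col in zip(*rows)]
             for c, rows in groups.items()}
    # Pooled within-class scatter matrix Sw (2x2, symmetric).
    s00 = s01 = s11 = 0.0
    for c, rows in groups.items():
        for a, b in rows:
            da, db = a - means[c][0], b - means[c][1]
            s00 += da * da
            s01 += da * db
            s11 += db * db
    det = s00 * s11 - s01 * s01
    d0 = means[1][0] - means[0][0]
    d1 = means[1][1] - means[0][1]
    # Closed-form inverse of the 2x2 matrix applied to the mean difference.
    return [(s11 * d0 - s01 * d1) / det, (-s01 * d0 + s00 * d1) / det]


def bootstrap_ci(X, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for each discriminant weight."""
    rng = random.Random(seed)
    n = len(y)
    samples = []
    while len(samples) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # skip degenerate resamples that contain a single class
        samples.append(fisher_weights([X[i] for i in idx], yb))
    lo_k = int((alpha / 2) * n_boot)
    hi_k = int((1 - alpha / 2) * n_boot) - 1
    cis = []
    for j in range(len(samples[0])):
        vals = sorted(w[j] for w in samples)
        cis.append((vals[lo_k], vals[hi_k]))
    return cis


X, y = make_data()
cis = bootstrap_ci(X, y)
for j, (lo, hi) in enumerate(cis):
    # A feature is flagged as relevant when its interval does not contain zero.
    relevant = lo > 0 or hi < 0
    print(f"feature {j}: CI = [{lo:.5f}, {hi:.5f}], relevant = {relevant}")
```

Under this reading, a feature whose weight interval excludes zero is kept and the rest are candidates for removal. For nonlinear models the abstract describes the analogous test on the change in error rate when a feature is removed, which is not shown here.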

Year of publication:

2016

Keywords:

Support vector machines
Spanish students
feature selection
bootstrap resampling
Fisher's discriminant
PISA report

Source:

Scopus

Document type:

Article

Status:

Restricted access

Knowledge areas:

Subject areas:

Education
Schools and their activities; special education
Computer science

Contributors:

Jose Luis Rojo-Álvarez

Gorostiaga A.