Potential model overfitting in predicting soil carbon content by visible and near-infrared spectroscopy
Abstract:
Soil spectroscopy is known as a rapid and cost-effective method for predicting soil properties from spectral data. The objective of this work was to build a statistical model to predict soil carbon content from spectral data by partial least squares regression using a limited number of soil samples. Soil samples were collected from two soil orders (Andisol and Ultisol), where the dominant land cover is native Nothofagus forest. Total carbon was analyzed in the laboratory and samples were scanned using a spectroradiometer. We found evidence that the reflectance was influenced by soil carbon content, which is consistent with the literature. However, the reflectance was not useful for building an appropriate regression model. Thus, we report here intriguing results obtained in the calibration process that can be confusing and misinterpreted. For instance, using the Savitzky-Golay filter for pre-processing spectral data, we obtained R² = 0.82 and root-mean-squared error (RMSE) = 0.61% in model calibration. However, despite these values being comparable with those of other similar studies, in the cross-validation procedure, the data showed an unusual behavior that leads to the conclusion that the model overfits the data. This indicates that the model should not be used on unobserved data.
Publication year:
2017
Keywords:
- Spectral diffuse reflectance
- Chemometrics
- Cross-validation
- Partial least squares regression
- SOC
Source:
Document type:
Article
Status:
Open access
Knowledge areas:
- Soil fertility
- Machine learning
- Mathematical optimization
Subject areas:
- Techniques, equipment, and materials