EDA and a Tailored Data Imputation Algorithm for Daily Ozone Concentrations


Abstract:

Air pollution is a critical environmental problem with detrimental effects on human health that is affecting all regions in the world, especially to low-income cities, where critical levels have been reached. Air pollution has a direct role in public health, climate change, and worldwide economy. Effective actions to mitigate air pollution, e.g. research and decision making, require of the availability of high resolution observations. This has motivated the emergence of new low-cost sensor technologies, which have the potential to provide high resolution data thanks to their accessible prices. However, since low-cost sensors are built with relatively low-cost materials, they tend to be unreliable. That is, measurements from low-cost sensors are prone to errors, gaps, bias and noise. All these problems need to be solved before the data can be used to support research or decision making. In this paper, we address the problem of data imputation on a daily air pollution data set with relatively small gaps. Our main contributions are: (1) an air pollution data set composed by several air pollution concentrations including criteria gases and thirteen meteorological covariates; and (2) a custom algorithm for data imputation of daily ozone concentrations based on a trend surface and a Gaussian Process. Data Visualization techniques were extensively used along this work, as they are useful tools for understanding the multi-dimensionality of point-referenced sensor data.

Año de publicación:

2019

Keywords:

  • air pollution
  • Data imputation
  • Gaussian Process
  • sensor data

Fuente:

scopusscopus
googlegoogle

Tipo de documento:

Conference Object

Estado:

Acceso restringido

Áreas de conocimiento:

  • Estadísticas
  • Análisis de datos
  • Algoritmo

Áreas temáticas:

  • Ciencias de la computación