A Novel Approach to Detect Missing Values Patterns in Time Series Data


Abstract:

The growing number of environmental sensors deployed to capture the behavior of cities produces large amounts of shared data. However, missing values are unavoidable, which becomes a critical problem for studies that require data analysis over extensive periods. The problem is most evident in longitudinal studies, since they depend on data collected over long periods. Hence, a convenient step is to support the data collection rules by determining the behavior of common missing data slots. This is possible by discovering missing data patterns in time series through four steps: (1) defining the data matrices, (2) computing and categorizing the missed periods using the proposed algorithm, (3) identifying the time analysis scenarios, and (4) applying the Kernel Density Estimation algorithm. This paper describes the experimentation of this method using a real air quality dataset from Cuenca, Ecuador, collected over one year. The results show that the proposed approach is useful for revealing missing data patterns. The approach also provides a good starting point for companies and laboratories interested in improving their data collection rules.
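The four steps in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not detail. It assumes a regularly sampled series, treats the hour of day at which a gap starts as one possible time analysis scenario, and uses a hand-written Gaussian kernel. The function names (`find_gaps`, `categorize`, `gaussian_kde`), the category thresholds, and the bandwidth are all hypothetical choices made for illustration.

```python
import math
from datetime import datetime, timedelta


def find_gaps(timestamps, step):
    """Return (gap_start, missing_count) for each run of missing readings.

    Assumes `timestamps` is sorted and nominally spaced by `step`.
    """
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        missing = round((curr - prev) / step) - 1
        if missing > 0:
            gaps.append((prev + step, missing))
    return gaps


def categorize(length, short_max=3, medium_max=12):
    """Label a gap by its length; the thresholds are illustrative assumptions."""
    if length <= short_max:
        return "short"
    if length <= medium_max:
        return "medium"
    return "long"


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels on `samples`."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        ) / norm

    return density


# Toy hourly series for one day with readings missing at 02-03 h and 14-17 h.
base = datetime(2020, 1, 1)
step = timedelta(hours=1)
missing_hours = {2, 3, 14, 15, 16, 17}
timestamps = [base + h * step for h in range(24) if h not in missing_hours]

gaps = find_gaps(timestamps, step)
labels = [categorize(length) for _, length in gaps]

# Scenario (an assumption): estimate the density of gap start hours.
start_hours = [start.hour for start, _ in gaps]
kde = gaussian_kde(start_hours, bandwidth=1.0)
```

Evaluating `kde` over the hours 0 to 23 gives a smooth curve whose peaks mark the hours at which gaps tend to begin. With real data, the bandwidth would need tuning, and a library estimator could replace the hand-written kernel.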

Year of publication:

2020

Keywords:

  • Missing values patterns
  • Kernel Density Estimation
  • Compute missing values

Source:

Google
Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Data analysis
  • Computer science
  • Statistical inference

Subject areas:

  • Special computer methods
  • Computer programming, programs, data, security
  • Operations of libraries and archives