Effects of data aggregation on time series analysis of seasonal infections


Abstract:

Time series analysis in epidemiological studies is typically conducted on aggregated counts, although data tend to be collected at finer temporal resolutions. The decision to aggregate data is rarely discussed in epidemiological literature although it has been shown to impact model results. We present a critical thinking process for making decisions about data aggregation in time series analysis of seasonal infections. We systematically build a harmonic regression model to characterize peak timing and amplitude of three respiratory and enteric infections that have different seasonal patterns and incidence. We show that irregularities introduced when aggregating data must be controlled during modeling to prevent erroneous results. Aggregation irregularities had a minimal impact on the estimates of trend, amplitude, and peak timing for daily and weekly data regardless of the disease. However, estimates of peak timing of the more common infections changed by as much as 2.5 months when controlling for monthly data irregularities. Building a systematic model that controls for data irregularities is essential to accurately characterize temporal patterns of infections. With the urgent need to characterize temporal patterns of novel infections, such as COVID-19, this tutorial is timely and highly valuable for experts in many disciplines.

Año de publicación:

2020

Keywords:

  • Data aggregation
  • Infectious diseases
  • seasonality
  • Harmonic regression
  • TIME SERIES

Fuente:

scopusscopus

Tipo de documento:

Article

Estado:

Acceso abierto

Áreas de conocimiento:

  • Análisis de datos
  • Epidemiología

Áreas temáticas:

  • Medicina forense; incidencia de enfermedades
  • Programación informática, programas, datos, seguridad
  • Problemas sociales y servicios a grupos