Unstructured grid applications on GPU: Performance analysis and improvement


Abstract:

Performance of applications running on GPUs is mainly affected by hardware occupancy and global memory latency. Scientific applications that rely on analysis using unstructured grids could benefit from the high performance capabilities provided by GPUs, however, its memory access pattern and algorithm limit the potential benefits. In this paper we analyze the algorithm for unstructured grid analysis on the basis of hardware occupancy and memory access efficiency. In general, the algorithm can be divided into three stages: cell-oriented analysis, edge-oriented analysis and information update, which present different memory access patterns. Based on the analysis we modify the algorithm to make it suitable for GPUs. The proposed algorithm aims for high hardware occupancy and efficient global memory access. Finally, through implementation we show that our design achieves up to 88 times speedup compared to the sequential CPU version. © 2011 ACM.

Año de publicación:

2011

Keywords:

  • CUDA
  • GPU
  • GPGPU
  • Unstructured grid

Fuente:

googlegoogle
scopusscopus

Tipo de documento:

Conference Object

Estado:

Acceso restringido

Áreas de conocimiento:

  • Ciencias de la computación
  • Ciencias de la computación

Áreas temáticas:

  • Ciencias de la computación
  • Ciencia y religión
  • Física aplicada