Unstructured grid applications on GPU: Performance analysis and improvement
Abstract:
The performance of applications running on GPUs is mainly affected by hardware occupancy and global memory latency. Scientific applications that rely on unstructured-grid analysis could benefit from the high performance of GPUs; however, their memory access patterns and algorithmic structure limit the potential gains. In this paper, we analyze the algorithm for unstructured grid analysis in terms of hardware occupancy and memory access efficiency. In general, the algorithm can be divided into three stages: cell-oriented analysis, edge-oriented analysis, and information update, each of which presents a different memory access pattern. Based on this analysis, we modify the algorithm to make it suitable for GPUs. The proposed algorithm aims for high hardware occupancy and efficient global memory access. Finally, through implementation, we show that our design achieves a speedup of up to 88 times over the sequential CPU version. © 2011 ACM.
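The three stages named in the abstract can be illustrated with a small sequential sketch. This is not the paper's algorithm or its GPU kernels, which this record does not reproduce. It is a minimal Python example under stated assumptions: the mesh, the per-cell scaling, the flux formula, and all function names (`cell_stage`, `edge_stage`, `update_stage`, `step`) are hypothetical and chosen only to show why the stages touch memory differently.

```python
# Hypothetical mesh: 4 cells, each edge stores the indices of its two cells.
CELL_VALUES = [1.0, 2.0, 4.0, 8.0]
EDGES = [(0, 1), (1, 2), (2, 3), (0, 3)]


def cell_stage(values):
    """Cell-oriented stage: each cell reads only its own entry.

    Consecutive cells read consecutive addresses, which is the access
    pattern that maps well to contiguous GPU memory loads.
    The 0.5 scaling is an arbitrary placeholder computation.
    """
    return [0.5 * v for v in values]


def edge_stage(cell_data, edges):
    """Edge-oriented stage: each edge gathers data from its two cells.

    The reads go through an index pair, so neighbouring edges may read
    scattered addresses (indirect access).
    """
    return [cell_data[right] - cell_data[left] for left, right in edges]


def update_stage(values, edge_flux, edges, dt):
    """Information update: each edge's flux is applied to both of its cells.

    Several edges write to the same cell, so a parallel version would
    need atomics, colouring, or a per-cell gather to avoid write races.
    """
    updated = list(values)
    for (left, right), flux in zip(edges, edge_flux):
        updated[left] += dt * flux
        updated[right] -= dt * flux
    return updated


def step(values, edges, dt=0.5):
    """Run the three stages once and return the new cell values."""
    cell_data = cell_stage(values)
    edge_flux = edge_stage(cell_data, edges)
    return update_stage(values, edge_flux, edges, dt)


if __name__ == "__main__":
    print(step(CELL_VALUES, EDGES))
```

In this sketch the update is antisymmetric across each edge, so the sum of the cell values is unchanged by a step. The loop in `update_stage` is the part that cannot be parallelized naively, since two edges sharing a cell would write to the same location.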
Publication year:
2011
Keywords:
- CUDA
- GPU
- GPGPU
- Unstructured grid
Source:
Document type:
Conference Object
Status:
Restricted access
Knowledge areas:
- Computer science
Subject areas:
- Computer science
- Science and religion
- Applied physics