Unstructured grid applications on GPU: Performance analysis and improvement
Abstract:
Performance of applications running on GPUs is mainly affected by hardware occupancy and global memory latency. Scientific applications that rely on analysis using unstructured grids could benefit from the high performance capabilities provided by GPUs, however, its memory access pattern and algorithm limit the potential benefits. In this paper we analyze the algorithm for unstructured grid analysis on the basis of hardware occupancy and memory access efficiency. In general, the algorithm can be divided into three stages: cell-oriented analysis, edge-oriented analysis and information update, which present different memory access patterns. Based on the analysis we modify the algorithm to make it suitable for GPUs. The proposed algorithm aims for high hardware occupancy and efficient global memory access. Finally, through implementation we show that our design achieves up to 88 times speedup compared to the sequential CPU version. © 2011 ACM.
Año de publicación:
2011
Keywords:
- CUDA
- GPU
- GPGPU
- Unstructured grid
Fuente:
scopus
googleTipo de documento:
Conference Object
Estado:
Acceso restringido
Áreas de conocimiento:
- Ciencias de la computación
- Ciencias de la computación
Áreas temáticas de Dewey:
- Ciencias de la computación
- Ciencia y religión
- Física aplicada
Objetivos de Desarrollo Sostenible:
- ODS 9: Industria, innovación e infraestructura
- ODS 17: Alianzas para lograr los objetivos
- ODS 8: Trabajo decente y crecimiento económico