Partial Flattening: A Compilation Technique for Irregular Nested Parallelism on GPGPUs

Ming Hsiang Huang, Wuu Yang

Research output: Conference contribution, peer-reviewed

3 Citations (Scopus)

Abstract

Supporting irregular nested parallelism on modern GPUs requires much effort. One should distribute the parallel tasks evenly while preserving reasonable memory usage. Moreover, the task distribution should also fit the thread hierarchy of the underlying GPU to fully exploit its computing power. We propose partial flattening, an automatic code transformation which translates annotated C programs to CUDA kernels. Thread blocks are treated as flat SIMT processors. Iterations are dynamically organized into batches. Batches are executed in a sequential (depth-first) order. A kernel is treated as multiple independent SIMT processors with an additional task-stealing mechanism. Partial flattening allows easy expression of nested parallelism and synchronization by annotating nested parallel loops or parallel-recursive calls, while preserving reasonable memory usage by the depth-first execution order. Our 2-level task distribution scheme does not need special hardware support, and fits well with the CUDA thread hierarchy. Experiments show that partial flattening outperforms NESL significantly in most benchmarks, and obtains 2.15x and 67x speedup over CUDA dynamic parallelism in Quicksort and the Bron-Kerbosch algorithm, respectively.
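The batching idea in the abstract can be illustrated with a small host-side simulation in plain C. This is only a sketch under stated assumptions, not the authors' transformation or its generated CUDA code: the names `WIDTH`, `Batch`, `push`, `work`, `run_reference`, and `run_batched` are hypothetical, the annotation syntax is not shown because the abstract does not give it, and task stealing between blocks is not modeled. The sketch takes an irregular nested loop (inner trip count `len[i]`), packs the inner iterations spawned by one group of outer iterations densely into batches of `WIDTH`, and drains those batches before the next outer group starts, which mimics a depth-first order with bounded storage.

```c
#include <assert.h>

#define WIDTH 4         /* hypothetical lane count of one simulated thread block */
#define MAX_BATCHES 64  /* stack capacity chosen for this sketch only */

typedef struct { int i, j; } Iter;                    /* one inner-loop iteration (i, j) */
typedef struct { Iter it[WIDTH]; int count; } Batch;  /* up to WIDTH iterations run together */

static Batch stack[MAX_BATCHES];  /* pending batches, most recent on top */
static int top = 0;

static void push(Batch b) {
    assert(top < MAX_BATCHES);    /* guard the fixed-size stack */
    stack[top++] = b;
}

/* Placeholder per-iteration work (an assumption; any side-effect-free body works). */
static long work(int i, int j) { return 10L * i + j; }

/* Sequential reference: the irregular nested loop as a programmer would write it. */
long run_reference(const int *len, int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < len[i]; ++j)
            sum += work(i, j);
    return sum;
}

/* Batched version: returns the same sum and reports how many batches were run. */
long run_batched(const int *len, int n, int *batches_run) {
    long sum = 0;
    *batches_run = 0;
    for (int base = 0; base < n; base += WIDTH) {       /* one outer group at a time */
        int outer = (n - base < WIDTH) ? (n - base) : WIDTH;
        Batch cur = {0};
        /* Each outer iteration spawns len[i] inner iterations; pack them densely. */
        for (int lane = 0; lane < outer; ++lane) {
            int i = base + lane;
            for (int j = 0; j < len[i]; ++j) {
                cur.it[cur.count].i = i;
                cur.it[cur.count].j = j;
                if (++cur.count == WIDTH) { push(cur); cur.count = 0; }
            }
        }
        if (cur.count > 0) push(cur);                    /* last, partially filled batch */
        /* Depth-first: finish all spawned batches before creating the next outer group. */
        while (top > 0) {
            Batch b = stack[--top];
            for (int lane = 0; lane < b.count; ++lane)   /* lanes would run in lockstep */
                sum += work(b.it[lane].i, b.it[lane].j);
            ++*batches_run;
        }
    }
    return sum;
}
```

With `len = {3, 0, 5, 1, 2, 7, 0, 4}` and `WIDTH` set to 4, the first outer group spawns 9 inner iterations (3 batches) and the second spawns 13 (4 batches), so `run_batched` executes 7 batches in total. Assigning one outer iteration per lane and running the inner loops in lockstep would instead take 5 + 7 = 12 steps, since each group waits for its longest inner loop.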

Original language: English
Title of host publication: Proceedings - 45th International Conference on Parallel Processing, ICPP 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 552-561
Number of pages: 10
ISBN (Electronic): 9781509028238
Publication status: Published - 21 Sep 2016
Event: 45th International Conference on Parallel Processing, ICPP 2016 - Philadelphia, United States
Duration: 16 Aug 2016 to 19 Aug 2016

Publication series

Name: Proceedings of the International Conference on Parallel Processing
2016-September
ISSN (Print): 0190-3918

Conference

Conference: 45th International Conference on Parallel Processing, ICPP 2016
Country/Territory: United States
City: Philadelphia
Period: 16/08/16 to 19/08/16
