TY - GEN
T1 - WER
T2 - 29th Asia and South Pacific Design Automation Conference, ASP-DAC 2024
AU - Huang, En Ming
AU - Cheng, Bo Wun
AU - Lin, Meng Hsien
AU - Lee, Chun Yi
AU - Yeh, Tsung Tai
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Irregular graphs are becoming increasingly prevalent across a broad spectrum of data analysis applications. Despite their versatility, the inherent complexity and irregularity of these graphs often result in the underutilization of Single Instruction, Multiple Data (SIMD) resources when processed on Graphics Processing Units (GPUs). This underutilization originates from two primary issues: the occurrence of inactive threads and intra-warp load imbalances. These issues can produce idle threads, lead to inefficient usage of SIMD resources, consequently hamper throughput, and increase program execution time. To address these challenges, we introduce Warp EqualizeR (WER), a framework designed to optimize the utilization of SIMD resources on a GPU for processing irregular graphs. WER employs both a software API and a specifically tailored hardware microarchitecture. Such a synergistic approach enables workload redistribution in irregular graphs, which allows WER to enhance SIMD lane utilization and further harness the SIMD resources within a GPU. Our experimental results over seven different graph applications indicate that WER yields a geometric mean speedup of 2.52× and 1.47× over the baseline GPU and existing state-of-the-art methodologies, respectively.
AB - Irregular graphs are becoming increasingly prevalent across a broad spectrum of data analysis applications. Despite their versatility, the inherent complexity and irregularity of these graphs often result in the underutilization of Single Instruction, Multiple Data (SIMD) resources when processed on Graphics Processing Units (GPUs). This underutilization originates from two primary issues: the occurrence of inactive threads and intra-warp load imbalances. These issues can produce idle threads, lead to inefficient usage of SIMD resources, consequently hamper throughput, and increase program execution time. To address these challenges, we introduce Warp EqualizeR (WER), a framework designed to optimize the utilization of SIMD resources on a GPU for processing irregular graphs. WER employs both a software API and a specifically tailored hardware microarchitecture. Such a synergistic approach enables workload redistribution in irregular graphs, which allows WER to enhance SIMD lane utilization and further harness the SIMD resources within a GPU. Our experimental results over seven different graph applications indicate that WER yields a geometric mean speedup of 2.52× and 1.47× over the baseline GPU and existing state-of-the-art methodologies, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85189372343&partnerID=8YFLogxK
U2 - 10.1109/ASP-DAC58780.2024.10473955
DO - 10.1109/ASP-DAC58780.2024.10473955
M3 - Conference contribution
AN - SCOPUS:85189372343
T3 - Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC
SP - 201
EP - 206
BT - ASP-DAC 2024 - 29th Asia and South Pacific Design Automation Conference, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 22 January 2024 through 25 January 2024
ER -