MCLB: Dynamic Load Balancing and Implications on GPU Memory Controllers

Vahid Geraeinejad*, Kun Chih Jimmy Chen, Zhonghai Lu, Masoumeh Ebrahimi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Graphics Processing Units (GPUs) play a pivotal role as primary devices for executing a diverse range of applications. Effective load balancing of the interconnection network is crucial in distributed computing systems because it ensures optimal resource utilization. While previous studies have addressed interconnection network load balancing, our investigation reveals that GPU cores often exhibit a uniform load pattern due to the nature of their workloads. Memory controllers, however, experience varying loads. This can lead to stall cycles during which memory requests cannot enter a specific controller's full queue and therefore remain in the interconnection network. Introducing the concept of 'busy' and 'relaxed' memory controllers, our proposed method, Memory Controller Load Balancing (MCLB), dynamically balances the load on memory controllers by categorizing them based on a predefined threshold. GPU cores temporarily pause sending memory requests to 'busy' memory controllers and prioritize 'relaxed' ones. This strategy reduces unnecessary congestion in the interconnection network and improves resource utilization in the memory request path. To our knowledge, MCLB is the first method specifically designed to balance memory controller loads in GPUs. MCLB significantly reduces the total number of memory controller stalls (eliminating them completely in some cases), which in turn lowers latency. It improves memory request and response round-trip latency by up to 11.8%, and interconnection network latency by up to 24.6%. This work presents a novel approach to GPU optimization by addressing memory controller load imbalances.
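
The busy/relaxed idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which metric is compared against the threshold or how a core chooses among pending requests, so the queue-occupancy metric, the value of BUSY_THRESHOLD, and the helper names classify and pick_request are all assumptions made for illustration.

```python
# Hypothetical threshold: queue occupancy at or above which a
# memory controller is treated as 'busy' (value chosen for illustration).
BUSY_THRESHOLD = 3


def classify(queue_occupancy, threshold=BUSY_THRESHOLD):
    """Label each memory controller 'busy' or 'relaxed' by queue occupancy."""
    return ["busy" if n >= threshold else "relaxed" for n in queue_occupancy]


def pick_request(pending, labels):
    """Remove and return the oldest pending request aimed at a 'relaxed' controller.

    `pending` is a list of (request_id, controller_index) pairs held by one core,
    oldest first. Requests aimed at 'busy' controllers stay in the list, which
    models the core pausing them. Returns None if nothing can be sent.
    """
    for i in range(len(pending)):
        _, mc = pending[i]
        if labels[mc] == "relaxed":
            return pending.pop(i)
    return None


# Controller 1 is over the threshold, so the request aimed at it is held back
# and the next request aimed at a relaxed controller is sent instead.
labels = classify([1, 4, 0])
pending = [("a", 1), ("b", 0), ("c", 2)]
first = pick_request(pending, labels)
```

In this sketch the paused request "a" stays at the head of the core's list and becomes eligible again once controller 1 drops below the threshold and is relabelled 'relaxed'.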

Original language: English
Title of host publication: Proceedings - 2024 IEEE 37th International System-on-Chip Conference, SOCC 2024
Editors: Diana Gohringer, Uwe Gabler, Tanja Harbaum, Klaus Hofmann
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350377569
DOIs
State: Published - 2024
Event: 37th IEEE International System-on-Chip Conference, SOCC 2024 - Dresden, Germany
Duration: 16 Sep 2024 → 19 Sep 2024

Publication series

Name: International System on Chip Conference
ISSN (Print): 2164-1676
ISSN (Electronic): 2164-1706

Conference

Conference: 37th IEEE International System-on-Chip Conference, SOCC 2024
Country/Territory: Germany
City: Dresden
Period: 16/09/24 → 19/09/24

Keywords

  • GPGPU
  • interconnection network
  • latency
  • load balancing
  • memory controller
  • stall
