Adaptive Machine Learning-Based Proactive Thermal Management for NoC Systems

Kun Chih Chen*, Yuan Hao Liao, Cheng Ting Chen, Lei Qi Wang

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

Abstract

Because interconnection in contemporary multicore systems is highly complex, network-on-chip (NoC) technology has proven to be an efficient solution to the communication problem in such systems. However, thermal problems have become the main design challenge in current NoC systems because of highly diverse workload distribution and large power density. Proactive dynamic thermal management (PDTM) is therefore employed as an efficient way to control the system temperature. Based on predicted temperature information, PDTM controls the system temperature in advance, which reduces the performance impact during the temperature control period. However, conventional temperature prediction models are usually built on specific physical parameters, which are themselves temperature-sensitive. Consequently, current temperature prediction models still produce significant prediction errors. To solve this problem, this work proposes a novel adaptive machine learning (ML)-based PDTM. The adaptive ML-based PDTM first uses an adaptive single-layer perceptron (ASLP), composed of a single-neuron operation and a least mean square (LMS) adaptive filter, to precisely predict the future temperature. The proposed adaptive reinforcement learning (RL) method is then used to find a proper throttling ratio to control the system temperature. In this way, the proposed adaptive ML-based PDTM can adapt to the hyperplane of the temperature behavior of the NoC system and provide a proper temperature control strategy at runtime. Compared with related works, the proposed approach reduces the average temperature prediction error by 0.2%-78.0% and improves system performance by 2.4%-43.0%, with smaller hardware overhead.
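
To illustrate the idea of a single neuron whose weights are adapted by an LMS filter, the following is a minimal sketch, not the authors' ASLP. The abstract does not give the update rule, input features, or step size, so everything here is an assumption: the function name `lms_predict_series`, the use of the last `taps` temperature samples as inputs, the step size `mu`, and the choice of the normalized LMS variant (picked only to keep the update numerically stable for inputs of arbitrary magnitude).

```python
def lms_predict_series(temps, taps=4, mu=0.5, eps=1e-6):
    """One-step-ahead temperature prediction with a single linear neuron.

    Hypothetical sketch: the neuron predicts temps[t] from the previous
    `taps` samples, then its weights are corrected with a normalized
    LMS step once the true sample is known.
    """
    weights = [0.0] * taps          # assumed initial state: all-zero weights
    predictions = []
    for t in range(taps, len(temps)):
        window = temps[t - taps:t]  # most recent `taps` observations
        # Single-neuron operation: weighted sum of the inputs.
        y_hat = sum(w * x for w, x in zip(weights, window))
        predictions.append(y_hat)
        # Prediction error against the sample that actually arrived.
        error = temps[t] - y_hat
        # Normalized LMS update: step scaled by the input energy.
        energy = eps + sum(x * x for x in window)
        weights = [w + mu * error * x / energy
                   for w, x in zip(weights, window)]
    return predictions, weights
```

On a synthetic trace that rises smoothly toward a steady temperature, the early predictions are poor (the weights start at zero) and the error shrinks as the weights adapt. A proactive controller in the spirit of the abstract would then choose a throttling ratio from the predicted value rather than from the current reading.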

Original language: English
Pages (from-to): 1114-1127
Number of pages: 14
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume: 31
Issue number: 8
State: Published - 1 Aug 2023

Keywords

  • Machine learning (ML)
  • network-on-chip (NoC)
  • neural network
  • reinforcement learning (RL)
  • temperature prediction
  • thermal management
