TY - JOUR
T1 - Adaptive Machine Learning-Based Proactive Thermal Management for NoC Systems
AU - Chen, Kun Chih
AU - Liao, Yuan Hao
AU - Chen, Cheng Ting
AU - Wang, Lei Qi
N1 - Publisher Copyright: © 1993-2012 IEEE.
PY - 2023/8/1
Y1 - 2023/8/1
N2 - Because interconnections in contemporary multicore systems are highly complex, network-on-chip (NoC) technology has proven to be an efficient way to solve the communication problem in these systems. However, the thermal problem has become the main design challenge in current NoC systems because of their highly diverse workload distribution and large power density. Proactive dynamic thermal management (PDTM) is therefore employed as an efficient way to control the system temperature. Using predicted temperature information, PDTM can control the system temperature in advance, which reduces the performance impact during the temperature control period. However, conventional temperature prediction models are usually built on specific physical parameters, which are themselves temperature-sensitive. Consequently, current temperature prediction models still produce significant prediction errors. To solve this problem, this work proposes a novel adaptive machine learning (ML)-based PDTM. The adaptive ML-based PDTM first uses an adaptive single-layer perceptron (ASLP), which combines a single-neuron operation with a least mean square (LMS) adaptive filter, to precisely predict the future temperature. The proposed adaptive reinforcement learning (RL) is then used to find the proper throttling ratio for controlling the system temperature. In this way, the proposed adaptive ML-based PDTM can adapt to the hyperplane of the temperature behavior of the NoC system and provide a proper temperature control strategy at runtime. Compared with related works, the proposed approach reduces the average temperature prediction error by 0.2%-78.0% and improves system performance by 2.4%-43.0% with smaller hardware overhead.
AB - Because interconnections in contemporary multicore systems are highly complex, network-on-chip (NoC) technology has proven to be an efficient way to solve the communication problem in these systems. However, the thermal problem has become the main design challenge in current NoC systems because of their highly diverse workload distribution and large power density. Proactive dynamic thermal management (PDTM) is therefore employed as an efficient way to control the system temperature. Using predicted temperature information, PDTM can control the system temperature in advance, which reduces the performance impact during the temperature control period. However, conventional temperature prediction models are usually built on specific physical parameters, which are themselves temperature-sensitive. Consequently, current temperature prediction models still produce significant prediction errors. To solve this problem, this work proposes a novel adaptive machine learning (ML)-based PDTM. The adaptive ML-based PDTM first uses an adaptive single-layer perceptron (ASLP), which combines a single-neuron operation with a least mean square (LMS) adaptive filter, to precisely predict the future temperature. The proposed adaptive reinforcement learning (RL) is then used to find the proper throttling ratio for controlling the system temperature. In this way, the proposed adaptive ML-based PDTM can adapt to the hyperplane of the temperature behavior of the NoC system and provide a proper temperature control strategy at runtime. Compared with related works, the proposed approach reduces the average temperature prediction error by 0.2%-78.0% and improves system performance by 2.4%-43.0% with smaller hardware overhead.
KW - Machine learning (ML)
KW - network-on-chip (NoC)
KW - neural network
KW - reinforcement learning (RL)
KW - temperature prediction
KW - thermal management
UR - http://www.scopus.com/inward/record.url?scp=85162621829&partnerID=8YFLogxK
U2 - 10.1109/TVLSI.2023.3282969
DO - 10.1109/TVLSI.2023.3282969
M3 - Article
AN - SCOPUS:85162621829
SN - 1063-8210
VL - 31
SP - 1114
EP - 1127
JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS - 8
ER -