Reinforcement Learning based Fragment-Aware Scheduling for High Utilization HPC Platforms

Lung Pin Chen, I. Chen Wu, Yen Ling Chang

Research output: Conference contribution, peer-reviewed

1 citation (Scopus)

Abstract

Due to high capacity and complex scheduling activities, an HPC platform often creates resource fragments with low usability. This paper develops a novel fragment-aware scheduling approach that improves system utilization by dynamically fitting elastic lightweight tasks into these resource fragments. The approach employs a threshold that balances the length of tasks against the granularity of the resource fragments. We use the PPO (Proximal Policy Optimization) reinforcement learning algorithm to train a neural network that computes this threshold precisely. Because the threshold adapts to changing system states, the PPO-based scheduler is able to utilize idle resources and maximize the execution success rate of the tasks.
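As a rough illustration of the idea in the abstract, the sketch below fits lightweight tasks into idle fragments under a fixed threshold. It is not the paper's algorithm: the `Fragment` and `Task` models, the admission rule (`task.length <= threshold * fragment.duration`), and the greedy best-fit placement are all assumptions made for this example. In the paper the threshold is produced by a PPO-trained neural network; here it is simply passed in as a number.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical model of an idle resource window."""
    nodes: int        # number of free nodes
    duration: float   # seconds until the nodes are reclaimed


@dataclass
class Task:
    """Hypothetical model of an elastic lightweight task."""
    name: str
    nodes: int        # nodes required
    length: float     # estimated run time in seconds


def fits(task: Task, frag: Fragment, threshold: float) -> bool:
    # Assumed admission rule: the task may occupy at most `threshold`
    # of the fragment's duration and must not need more nodes than are free.
    return task.nodes <= frag.nodes and task.length <= threshold * frag.duration


def schedule(tasks, fragments, threshold):
    """Greedy best fit: longest tasks first, each into the smallest fragment it fits.

    Returns a dict mapping task name to fragment index; tasks that fit
    nowhere are left out.
    """
    free = list(range(len(fragments)))
    placement = {}
    for task in sorted(tasks, key=lambda t: t.length, reverse=True):
        candidates = [i for i in free if fits(task, fragments[i], threshold)]
        if not candidates:
            continue  # no fragment can host this task under the threshold
        best = min(candidates, key=lambda i: fragments[i].nodes * fragments[i].duration)
        placement[task.name] = best
        free.remove(best)  # one task per fragment in this simplified sketch
    return placement


fragments = [Fragment(nodes=4, duration=100.0), Fragment(nodes=2, duration=40.0)]
tasks = [Task("a", 2, 30.0), Task("b", 4, 90.0), Task("c", 1, 10.0)]
conservative = schedule(tasks, fragments, threshold=0.8)
aggressive = schedule(tasks, fragments, threshold=1.0)
```

With these made-up numbers, a lower threshold rejects the long task `b` and places the two short tasks, while a threshold of 1.0 admits `b` at the risk of it not finishing before the fragment is reclaimed. This is the kind of trade-off an adaptive threshold would have to balance.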

Original language: English
Title of host publication: Proceedings - 2019 International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781728146669
Publication status: Published - Nov 2019
Event: 24th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019 - Kaohsiung, Taiwan
Duration: 21 Nov 2019 to 23 Nov 2019

Publication series

Name: Proceedings - 2019 International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019

Conference

Conference: 24th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019
Country/Territory: Taiwan
City: Kaohsiung
Period: 21/11/19 to 23/11/19
