TY - GEN
T1 - Accuracy-Time Efficient Hyperparameter Optimization Using Actor-Critic-based Reinforcement Learning and Early Stopping in OpenAI Gym Environment
AU - Christian, Albert Budi
AU - Lin, Chih Yu
AU - Tseng, Yu Chee
AU - Van, Lan Da
AU - Hu, Wan Hsun
AU - Yu, Chia Hsuan
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - In this paper, we present an accuracy-time efficient hyperparameter optimization (HPO) method using advantage actor-critic (A2C)-based reinforcement learning (RL) and early stopping in the OpenAI Gym environment. The A2C RL agent improves hyperparameter selection so that machine learning (ML) algorithms, including XGBoost, the support vector classifier (SVC), and random forest, achieve comparable accuracy. Given a specified target accuracy for the ML algorithms, the early stopping scheme reduces computation cost. Ten standard datasets are used to validate the accuracy-time efficient HPO. Experimental results show that the presented accuracy-efficient HPO architecture improves accuracy by 0.77% on average compared with default hyperparameters for random forest, and early stopping saves 64% of the computation cost on average compared to running without early stopping for random forest.
AB - In this paper, we present an accuracy-time efficient hyperparameter optimization (HPO) method using advantage actor-critic (A2C)-based reinforcement learning (RL) and early stopping in the OpenAI Gym environment. The A2C RL agent improves hyperparameter selection so that machine learning (ML) algorithms, including XGBoost, the support vector classifier (SVC), and random forest, achieve comparable accuracy. Given a specified target accuracy for the ML algorithms, the early stopping scheme reduces computation cost. Ten standard datasets are used to validate the accuracy-time efficient HPO. Experimental results show that the presented accuracy-efficient HPO architecture improves accuracy by 0.77% on average compared with default hyperparameters for random forest, and early stopping saves 64% of the computation cost on average compared to running without early stopping for random forest.
KW - Accuracy-time efficiency
KW - Actor-Critic
KW - early stopping
KW - Hyperparameter optimization
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85145968661&partnerID=8YFLogxK
U2 - 10.1109/IoTaIS56727.2022.9975984
DO - 10.1109/IoTaIS56727.2022.9975984
M3 - Conference contribution
AN - SCOPUS:85145968661
T3 - Proceedings of the 2022 IEEE International Conference on Internet of Things and Intelligence Systems, IoTaIS 2022
SP - 230
EP - 234
BT - Proceedings of the 2022 IEEE International Conference on Internet of Things and Intelligence Systems, IoTaIS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Internet of Things and Intelligence Systems, IoTaIS 2022
Y2 - 24 November 2022 through 26 November 2022
ER -