Net2Net Extension for the AlphaGo Zero Algorithm

Hsiao Chung Hsieh, Ti Rong Wu, Ting Han Wei, I. Chen Wu*


Research output: Conference contribution, peer-reviewed


The number of residual network blocks in a computer Go program following the AlphaGo Zero algorithm is one of the key factors in the program’s playing strength. In this paper, we propose a method to deepen the residual network without reducing performance. Next, as self-play tends to be the most time-consuming part of AlphaGo Zero training, we demonstrate how training can continue on this deepened residual network using the self-play records generated by the original network, which saves time. The deepening process is performed by inserting new layers into the original network. We present three insertion schemes based on the concept behind Net2Net. Lastly, of the many different ways to sample the previously generated self-play records, we propose two methods so that the deepened network can continue the training process. In our experiment on the extension from 20 residual blocks to 40 residual blocks for 9 × 9 Go, the results show that the best-performing extension scheme obtains a 61.69% win rate against the unextended player (20 blocks) while greatly reducing the time spent on self-play.
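The idea of deepening a residual network without changing its output can be illustrated with a small sketch. This is not one of the paper's three insertion schemes, which the abstract does not specify. It assumes a toy dense residual block of the form relu(x + W2 · relu(W1 · x)) in place of the convolutional blocks used in AlphaGo Zero, and a hypothetical interleaving rule (one new block after each original block). All names (`residual_block`, `identity_block`, `deepen`) are illustrative. The new block's second layer is set to zero, so the block passes its non-negative input through unchanged, which is the function-preserving property Net2Net relies on.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(w, v):
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def residual_block(x, w1, w2):
    """Toy dense residual block: relu(x + w2 @ relu(w1 @ x))."""
    hidden = relu(matvec(w1, x))
    return relu([a + b for a, b in zip(x, matvec(w2, hidden))])

def forward(x, blocks):
    h = relu(x)  # stand-in for the network stem; makes block inputs non-negative
    for w1, w2 in blocks:
        h = residual_block(h, w1, w2)
    return h

def random_matrix(n, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

def identity_block(n, rng):
    """New block with an all-zero second layer: it returns its non-negative input."""
    return random_matrix(n, rng), [[0.0] * n for _ in range(n)]

def deepen(blocks, n, rng):
    """Hypothetical scheme (an assumption): insert one identity block after each original block."""
    deeper = []
    for block in blocks:
        deeper.append(block)
        deeper.append(identity_block(n, rng))
    return deeper

rng = random.Random(0)
width = 4
original = [(random_matrix(width, rng), random_matrix(width, rng)) for _ in range(3)]
deepened = deepen(original, width, rng)
x = [rng.uniform(-1.0, 1.0) for _ in range(width)]

# The deepened network has twice as many blocks but computes the same function.
print(len(original), len(deepened))
print(forward(x, deepened) == forward(x, original))
```

Because the deepened network starts from the same function as the original, records generated by the original network remain consistent training targets at the moment of extension; the first layer of each new block is left random so that later training can move it away from the identity.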

Title of host publication: Advances in Computer Games - 16th International Conference, ACG 2019, Revised Selected Papers
Editors: Tristan Cazenave, Jaap van den Herik, Abdallah Saffidine, I-Chen Wu
Publisher: Springer Science and Business Media Deutschland GmbH
Publication status: Published - 2020
Event: 16th International Conference on Advances in Computer Games, ACG 2019 - Macao, China
Duration: 11 Aug 2019 to 13 Aug 2019


Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12516 LNCS


Conference: 16th International Conference on Advances in Computer Games, ACG 2019

