TY - GEN
T1 - On strength adjustment for MCTS-based programs
AU - Wu, I. Chen
AU - Wu, Ti Rong
AU - Liu, An Jen
AU - Guei, Hung
AU - Wei, Tinghan
N1 - Publisher Copyright: © 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2019/7
Y1 - 2019/7
N2 - This paper proposes an approach to strength adjustment for MCTS-based game-playing programs. In this approach, we use a softmax policy with a strength index to choose moves. Most importantly, we filter out low-quality moves by excluding those whose simulation count is lower than a pre-defined threshold ratio of the maximum simulation count. We perform a theoretical analysis, showing that, by using a threshold ratio, the adjusted policy is guaranteed to choose moves exceeding a lower bound in strength. The approach is applied to the Go program ELF OpenGo. The experimental results show that the strength index is highly correlated with the empirical strength; namely, given a threshold ratio of 0.1, the strength index is linearly related to the Elo rating with a regression error of 47.95 Elo over the tested interval of the index. Meanwhile, the covered strength range is about 800 Elo over that interval. With the ease of strength adjustment using the strength index, we present two methods to adjust strength and predict opponents' strengths dynamically. To our knowledge, this result is state-of-the-art in terms of the range of strengths in Elo rating while maintaining a controllable relationship between the strength and a strength index.
AB - This paper proposes an approach to strength adjustment for MCTS-based game-playing programs. In this approach, we use a softmax policy with a strength index to choose moves. Most importantly, we filter out low-quality moves by excluding those whose simulation count is lower than a pre-defined threshold ratio of the maximum simulation count. We perform a theoretical analysis, showing that, by using a threshold ratio, the adjusted policy is guaranteed to choose moves exceeding a lower bound in strength. The approach is applied to the Go program ELF OpenGo. The experimental results show that the strength index is highly correlated with the empirical strength; namely, given a threshold ratio of 0.1, the strength index is linearly related to the Elo rating with a regression error of 47.95 Elo over the tested interval of the index. Meanwhile, the covered strength range is about 800 Elo over that interval. With the ease of strength adjustment using the strength index, we present two methods to adjust strength and predict opponents' strengths dynamically. To our knowledge, this result is state-of-the-art in terms of the range of strengths in Elo rating while maintaining a controllable relationship between the strength and a strength index.
UR - http://www.scopus.com/inward/record.url?scp=85076752460&partnerID=8YFLogxK
U2 - 10.1609/aaai.v33i01.33011222
DO - 10.1609/aaai.v33i01.33011222
M3 - Conference contribution
AN - SCOPUS:85076752460
T3 - 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019
SP - 1222
EP - 1229
BT - 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019
PB - AAAI Press
T2 - 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Annual Conference on Innovative Applications of Artificial Intelligence, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019
Y2 - 27 January 2019 through 1 February 2019
ER -