TY - GEN
T1 - AlphaZero for a Non-Deterministic Game
AU - Hsueh, Chu Hsuan
AU - Wu, I-Chen
AU - Chen, Jr Chang
AU - Hsu, Tsan Sheng
PY - 2018/12/24
Y1 - 2018/12/24
N2 - The AlphaZero algorithm, developed by DeepMind, achieved superhuman levels of play in the games of chess, shogi, and Go, by learning without domain-specific knowledge except game rules. This paper investigates whether the algorithm can also learn theoretical values and optimal plays for non-deterministic games. Since the theoretical values of such games are expected win rates, not a simple win, loss, or draw, it is worth investigating the ability of the AlphaZero algorithm to approximate expected win rates of positions. This paper also studies how the algorithm is influenced by a set of hyper-parameters. The tested non-deterministic game is a reduced and solved version of Chinese dark chess (CDC), called 2×4 CDC. The experiments show that the AlphaZero algorithm converges nearly to the theoretical values and the optimal plays in many of the settings of the hyper-parameters. To our knowledge, this is the first research paper that applies the AlphaZero algorithm to non-deterministic games.
AB - The AlphaZero algorithm, developed by DeepMind, achieved superhuman levels of play in the games of chess, shogi, and Go, by learning without domain-specific knowledge except game rules. This paper investigates whether the algorithm can also learn theoretical values and optimal plays for non-deterministic games. Since the theoretical values of such games are expected win rates, not a simple win, loss, or draw, it is worth investigating the ability of the AlphaZero algorithm to approximate expected win rates of positions. This paper also studies how the algorithm is influenced by a set of hyper-parameters. The tested non-deterministic game is a reduced and solved version of Chinese dark chess (CDC), called 2×4 CDC. The experiments show that the AlphaZero algorithm converges nearly to the theoretical values and the optimal plays in many of the settings of the hyper-parameters. To our knowledge, this is the first research paper that applies the AlphaZero algorithm to non-deterministic games.
KW - AlphaZero
KW - Chinese dark chess
KW - Non-deterministic game
KW - Theoretical value
UR - http://www.scopus.com/inward/record.url?scp=85061448895&partnerID=8YFLogxK
U2 - 10.1109/TAAI.2018.00034
DO - 10.1109/TAAI.2018.00034
M3 - Conference contribution
AN - SCOPUS:85061448895
T3 - Proceedings - 2018 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2018
SP - 116
EP - 121
BT - Proceedings - 2018 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2018
Y2 - 30 November 2018 through 2 December 2018
ER -