AlphaZero for a Non-Deterministic Game

Chu Hsuan Hsueh, I-Chen Wu, Jr Chang Chen, Tsan Sheng Hsu

Research output: Conference contribution (peer-reviewed)

9 Citations (Scopus)

Abstract

The AlphaZero algorithm, developed by DeepMind, achieved superhuman levels of play in chess, shogi, and Go by learning without domain-specific knowledge other than the game rules. This paper investigates whether the algorithm can also learn theoretical values and optimal plays for non-deterministic games. Since the theoretical values of such games are expected win rates, not a simple win, loss, or draw, it is worth investigating the ability of the AlphaZero algorithm to approximate the expected win rates of positions. This paper also studies how the algorithm is influenced by a set of hyper-parameters. The tested non-deterministic game is a reduced and solved version of Chinese dark chess (CDC), called 2×4 CDC. The experiments show that the AlphaZero algorithm nearly converges to the theoretical values and the optimal plays under many of the hyper-parameter settings. To our knowledge, this is the first research paper that applies the AlphaZero algorithm to non-deterministic games.

Original language: English
Title of host publication: Proceedings - 2018 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 116-121
Number of pages: 6
ISBN (Electronic): 9781728112299
Publication status: Published - 24 Dec 2018
Event: 2018 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2018 - Taichung, Taiwan
Duration: 30 Nov 2018 to 2 Dec 2018

Publication series

Name: Proceedings - 2018 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2018

Conference

Conference: 2018 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2018
Country/Territory: Taiwan
City: Taichung
Period: 30/11/18 to 2/12/18
