An Empirical Analysis of Gumbel MuZero on Stochastic and Deterministic Einstein Würfelt Nicht!

Chien Liang Kuo, Po Ting Chen, Hung Guei, De Rong Sung, Chu Hsuan Hsueh, Ti Rong Wu*, I. Chen Wu

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

Abstract

MuZero and its successors, Gumbel MuZero and Stochastic MuZero, have achieved superhuman performance in many domains. MuZero combines Monte Carlo tree search and model-based reinforcement learning, which allows it to be used in complex environments without prior knowledge of the actual dynamics. Gumbel MuZero enhances the training quality of MuZero by guaranteeing policy improvement, which allows it to learn with a limited number of simulations for tree search. Stochastic MuZero broadens the applicable domains using a redesigned model, which allows it to cope with stochastic environments. Recently, an approach combining Gumbel MuZero and Stochastic MuZero was applied to a stochastic game called 2048, revealing a counterintuitive phenomenon: agents trained with only 3 simulations performed better than agents trained with 16 or 50 simulations. However, this phenomenon has only been observed in 2048 and awaits further investigation. This paper examines two questions: (1) whether this phenomenon also occurs in another well-known stochastic game, EinStein würfelt nicht! (EWN), and (2) whether the stochasticity of the environment is the main reason for the phenomenon. To investigate these questions, this paper analyzes the training results on stochastic EWN and four deterministic EWN variants. The experiments confirm that the phenomenon also occurs in stochastic EWN but not in the deterministic variants, suggesting that stochasticity leads to better performance of agents trained with fewer simulations.

Original language: English
Title of host publication: Technologies and Applications of Artificial Intelligence - 28th International Conference, TAAI 2023, Proceedings
Editors: Chao-Yang Lee, Chun-Li Lin, Hsuan-Ting Chang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 329-342
Number of pages: 14
ISBN (Print): 9789819717101
Publication status: Published - 2024
Event: 28th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2023 - Yunlin, Taiwan
Duration: 1 Dec 2023 - 2 Dec 2023

Publication series

Name: Communications in Computer and Information Science
Volume: 2074 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 28th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2023
Country/Territory: Taiwan
City: Yunlin
Period: 1/12/23 - 2/12/23
