Solving hard-exploration problems with counting and replay approach

Bo Ying Huang, Shi Chun Tsai*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Reinforcement learning agents have been very successful in many Atari 2600 games. However, when applied to more complex and challenging environments, it is crucial to avoid falling into local optima, especially when a game contains many traps, a large action space, challenging scenarios, and only sporadic successful episodes. In such cases, methods based purely on intrinsic motivation easily fall into local optima, while methods that rely heavily on domain knowledge do not transfer to different game designs. Therefore, to enhance the agent's ability to explore and to avoid the catastrophic forgetting caused by fading intrinsic motivation, we develop a Trajectory Evaluation Module that integrates ideas from Count-Based Exploration and the Trajectory Replay method. Moreover, our approach combines well with Self-Imitation Learning and works effectively on hard-exploration video games. We evaluate our policy on two video games: Super Mario Bros and Sonic the Hedgehog. The experimental results show that the Trajectory Evaluation Module helps the agent pass through various obstacles and scenarios and successfully clear all levels of Super Mario Bros.
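For readers unfamiliar with the two ingredients named above, the following is a minimal, self-contained sketch of how a count-based exploration bonus and a return-ranked trajectory buffer can fit together. It is an illustration only, not the authors' implementation: the bonus formula beta / sqrt(N(s)), the string state keys, and the fixed-capacity top-k heap are assumptions made for this example.

```python
# Illustrative sketch (assumptions, not the paper's code): a count-based
# intrinsic bonus that decays with state visits, plus a buffer that keeps
# the highest-return trajectories for later replay/imitation.
from collections import defaultdict
import heapq
import math


class CountBonus:
    """Intrinsic reward beta / sqrt(N(s)) for a hashed/discretized state."""

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = defaultdict(int)

    def bonus(self, state_key):
        # Increment the visit count, then return a bonus that shrinks
        # as the same state key is revisited.
        self.counts[state_key] += 1
        return self.beta / math.sqrt(self.counts[state_key])


class TrajectoryBuffer:
    """Keeps the top-k trajectories by episode return for replay."""

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.heap = []   # min-heap of (return, tie_breaker, trajectory)
        self._tie = 0    # tie-breaker so trajectories are never compared

    def add(self, trajectory, episode_return):
        item = (episode_return, self._tie, trajectory)
        self._tie += 1
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, item)
        else:
            heapq.heappushpop(self.heap, item)  # evict the worst trajectory

    def best(self):
        return max(self.heap)[2] if self.heap else None


if __name__ == "__main__":
    bonus = CountBonus()
    buf = TrajectoryBuffer(capacity=2)
    print(bonus.bonus("level1_x=40"))   # 0.1 on the first visit
    print(bonus.bonus("level1_x=40"))   # ~0.0707 on the second visit
    buf.add(["a", "b"], episode_return=5.0)
    buf.add(["a", "c"], episode_return=9.0)
    print(buf.best())                   # ['a', 'c']
```

In a setup like the one the abstract describes, high-return trajectories retrieved from such a buffer would supply the material for Self-Imitation-Learning-style updates once the intrinsic bonus has faded; the top-k heap shown here is just one simple way to decide which trajectories are worth keeping.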

Original language: English
Article number: 104701
Journal: Engineering Applications of Artificial Intelligence
Volume: 110
State: Published - Apr 2022

Keywords

  • Count-Based Exploration
  • Hard-exploration video games
  • Sonic the Hedgehog
  • Sparse reward
  • Super Mario Bros
  • Deep reinforcement learning
