Stochastic curiosity exploration for dialogue systems

Jen Tzung Chien, Po Chien Hsu

研究成果: Conference article同行評審

9 引文 斯高帕斯(Scopus)


Traditionally, task-oriented dialogue system is built by an autonomous agent which can be trained by reinforcement learning where the reward from environment is maximized. The agent is learned by updating the policy when the goal state is observed. However, in real world, the extrinsic reward is usually sparse or missing. The training efficiency is bounded. The system performance is degraded. It is challenging to tackle the issue of sample efficiency in sparse reward scenario for spoken dialogues. Accordingly, a dialogue agent needs additional information to update its policy even in the period when reward is absent in the environment. This paper presents a new dialogue agent which is learned by incorporating the intrinsic reward based on the information-theoretic approach via stochastic curiosity exploration. This agent encourages the exploration for future diversity based on a latent dynamic architecture which consists of encoder network, curiosity network, information network and policy network. The latent states and actions are drawn to predict stochastic transition for future. The curiosity learning are implemented with intrinsic reward in a metric of mutual information and prediction error in the predicted states and actions. Experiments on dialogue management using PyDial demonstrate the benefit by using the stochastic curiosity exploration.

頁(從 - 到)3885-3889
期刊Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
出版狀態Published - 2020
事件21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
持續時間: 25 10月 202029 10月 2020


深入研究「Stochastic curiosity exploration for dialogue systems」主題。共同形成了獨特的指紋。