Stochastic curiosity exploration for dialogue systems

Jen Tzung Chien, Po Chien Hsu

Research output: Conference article, peer-reviewed

11 citations (Scopus)

Abstract

Traditionally, a task-oriented dialogue system is built around an autonomous agent trained by reinforcement learning to maximize the reward from the environment. The agent learns by updating its policy when the goal state is observed. In the real world, however, the extrinsic reward is usually sparse or missing, which limits training efficiency and degrades system performance. Tackling sample efficiency in sparse-reward scenarios for spoken dialogue is challenging. Accordingly, a dialogue agent needs additional information to update its policy even during periods when the environment provides no reward. This paper presents a new dialogue agent that is learned by incorporating an intrinsic reward, based on an information-theoretic approach, via stochastic curiosity exploration. The agent encourages exploration for future diversity using a latent dynamic architecture that consists of an encoder network, a curiosity network, an information network and a policy network. Latent states and actions are drawn to predict stochastic transitions for the future. Curiosity learning is implemented with an intrinsic reward measured by mutual information and prediction error in the predicted states and actions. Experiments on dialogue management using PyDial demonstrate the benefit of stochastic curiosity exploration.
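
The abstract names two ingredients of the intrinsic reward, mutual information and prediction error, without giving formulas. The following is a minimal toy sketch of how such terms could be added to a sparse extrinsic reward. It is not the authors' method: the function names, the squared-error distance, the discrete joint distribution and the weights `w_err` and `w_mi` are all assumptions made for illustration, whereas the paper works with learned latent states and actions.

```python
import math


def prediction_error(predicted, actual):
    """Squared Euclidean distance between a predicted and an observed state vector.

    Assumption: the paper's error is defined on learned latent variables;
    plain lists of floats stand in for them here.
    """
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution.

    `joint` is a 2-D list whose entries sum to 1; rows index one variable
    (for example a state) and columns the other (for example an action).
    """
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # terms with zero probability contribute nothing
                mi += p * math.log(p / (row_marginal[i] * col_marginal[j]))
    return mi


def shaped_reward(extrinsic, pred_err, mi, w_err=0.1, w_mi=0.1):
    """Extrinsic reward plus a weighted intrinsic bonus (hypothetical weights)."""
    return extrinsic + w_err * pred_err + w_mi * mi


if __name__ == "__main__":
    # A turn with no extrinsic reward still yields a learning signal.
    err = prediction_error([1.0, 2.0], [1.0, 4.0])
    mi = mutual_information([[0.5, 0.0], [0.0, 0.5]])
    print(shaped_reward(0.0, err, mi))
```

With this shaping, a dialogue turn whose extrinsic reward is zero still produces a nonzero signal whenever the transition was poorly predicted or the paired variables are informative about each other, which is the role the abstract assigns to curiosity.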

Original language: English
Pages (from-to): 3885-3889
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2020-October
Publication status: Published - 2020
Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration: 25 Oct 2020 to 29 Oct 2020
