Corrective Guidance and Learning for Dialogue Management

Mahdin Rohmatillah, Jen Tzung Chien

Research output: Conference contribution, peer-reviewed

8 citations (Scopus)

Abstract

Establishing a robust dialogue policy at low computational cost is challenging, especially for multi-domain task-oriented dialogue management, owing to the high complexity of the state and action spaces. Previous works, which mostly use deterministic policy optimization, attain only moderate performance. Meanwhile, the state-of-the-art result, obtained with an end-to-end approach, is computationally demanding because it relies on a large-scale language model based on the generative pre-trained transformer-2 (GPT-2). In this study, a new learning procedure consisting of three learning stages is presented to improve multi-domain dialogue management with corrective guidance. First, behavior cloning with an auxiliary task is developed to build a robust pre-trained model by mitigating the causal confusion problem in imitation learning. Next, the pre-trained model is rectified by reinforcement learning via proximal policy optimization. Lastly, a human-in-the-loop learning strategy is applied to enhance agent performance by directly providing corrective feedback from a rule-based agent, so that the agent is prevented from becoming trapped in confounded states. Experiments on end-to-end evaluation show that the proposed learning method achieves a state-of-the-art result, performing nearly identically to the rule-based agent. This method outperforms the second-place entry of the 9th Dialog System Technology Challenge (DSTC9) Track 2, which uses GPT-2 as the core model in dialogue management.
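As an illustration of the second stage only, the clipped surrogate objective commonly associated with proximal policy optimization can be sketched as follows. This is a minimal sketch and not the authors' implementation: the function name `ppo_clipped_objective`, its parameters, and the default clipping value of 0.2 are assumptions chosen for illustration, and the abstract does not state which PPO variant or hyperparameters were used.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and the policy that collected the data (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    clip_eps: clipping range; 0.2 is an assumed, commonly used default.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree on an action, the ratio is 1 and the term reduces to the advantage itself; the clipping only takes effect once the current policy has moved away from the one that generated the dialogue data.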

Original language: English
Title of host publication: CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 1548-1557
Number of pages: 10
ISBN (electronic): 9781450384469
DOIs
Publication status: Published - 26 Oct 2021
Event: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia
Duration: 1 Nov 2021 to 5 Nov 2021

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
Country/Territory: Australia
City: Virtual, Online
Period: 1/11/21 to 5/11/21
