TY - GEN
T1 - Multitask Generative Adversarial Imitation Learning for Multi-Domain Dialogue System
AU - Hsu, Chuan En
AU - Rohmatillah, Mahdin
AU - Chien, Jen Tzung
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - In a task-oriented dialogue system, the dialogue policy plays an important role because it determines suitable actions based on the user's goals. In real situations, however, user goals vary, so the system must solve a complex optimization problem for the dialogue policy. This paper presents a novel approach to building a multi-domain dialogue system based on multitask generative adversarial imitation learning (MGAIL). MGAIL combines hierarchical reinforcement learning and generative adversarial imitation learning, where a mixture of generators is used for multitask learning. Unlike traditional imitation learning, this method decomposes each complex task into several subtasks and builds the policy hierarchically, which eases the agent's handling of multiple complex tasks. Experiments on a multi-domain dialogue system using MultiWOZ 2.1 under the ConvLab-2 framework show that the proposed method outperforms other reinforcement learning methods in system-wise evaluation in terms of complete rate, success rate, and book rate.
AB - In a task-oriented dialogue system, the dialogue policy plays an important role because it determines suitable actions based on the user's goals. In real situations, however, user goals vary, so the system must solve a complex optimization problem for the dialogue policy. This paper presents a novel approach to building a multi-domain dialogue system based on multitask generative adversarial imitation learning (MGAIL). MGAIL combines hierarchical reinforcement learning and generative adversarial imitation learning, where a mixture of generators is used for multitask learning. Unlike traditional imitation learning, this method decomposes each complex task into several subtasks and builds the policy hierarchically, which eases the agent's handling of multiple complex tasks. Experiments on a multi-domain dialogue system using MultiWOZ 2.1 under the ConvLab-2 framework show that the proposed method outperforms other reinforcement learning methods in system-wise evaluation in terms of complete rate, success rate, and book rate.
KW - Dialogue policy optimization
KW - generative adversarial imitation learning
KW - multi-domain dialogues
UR - http://www.scopus.com/inward/record.url?scp=85126717783&partnerID=8YFLogxK
U2 - 10.1109/ASRU51503.2021.9688234
DO - 10.1109/ASRU51503.2021.9688234
M3 - Conference contribution
AN - SCOPUS:85126717783
T3 - 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings
SP - 954
EP - 961
BT - 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021
Y2 - 13 December 2021 through 17 December 2021
ER -