TY - JOUR
T1 - Self attention in variational sequential learning for summarization
AU - Chien, Jen-Tzung
AU - Wang, Chun-Wei
N1 - Publisher Copyright:
Copyright © 2019 ISCA
PY - 2019
Y1 - 2019
N2 - The attention mechanism plays a crucial role in sequential learning for many speech and language applications. However, it is challenging to develop stochastic attention in a sequence-to-sequence model consisting of two recurrent neural networks (RNNs) as the encoder and decoder. The problem of posterior collapse arises in variational inference: the estimated latent variables stay close to the standard Gaussian prior, so the information from the input sequence is disregarded during learning. This paper presents a new recurrent autoencoder for sentence representation in which a self-attention scheme is incorporated to activate the interaction between inference and generation during training. In particular, a stochastic RNN decoder is implemented to provide an additional latent variable that fulfills self-attention for sentence reconstruction. Posterior collapse is alleviated, and the latent information is sufficiently attended in variational sequential learning. During the test phase, the estimated prior distribution of the decoder is sampled for stochastic attention and generation. Experiments on the Penn Treebank and Yelp 2013 show desirable generation performance in terms of perplexity. The visualization of attention weights also illustrates the usefulness of self-attention. The evaluation on DUC 2007 demonstrates the merit of the variational recurrent autoencoder for document summarization.
AB - The attention mechanism plays a crucial role in sequential learning for many speech and language applications. However, it is challenging to develop stochastic attention in a sequence-to-sequence model consisting of two recurrent neural networks (RNNs) as the encoder and decoder. The problem of posterior collapse arises in variational inference: the estimated latent variables stay close to the standard Gaussian prior, so the information from the input sequence is disregarded during learning. This paper presents a new recurrent autoencoder for sentence representation in which a self-attention scheme is incorporated to activate the interaction between inference and generation during training. In particular, a stochastic RNN decoder is implemented to provide an additional latent variable that fulfills self-attention for sentence reconstruction. Posterior collapse is alleviated, and the latent information is sufficiently attended in variational sequential learning. During the test phase, the estimated prior distribution of the decoder is sampled for stochastic attention and generation. Experiments on the Penn Treebank and Yelp 2013 show desirable generation performance in terms of perplexity. The visualization of attention weights also illustrates the usefulness of self-attention. The evaluation on DUC 2007 demonstrates the merit of the variational recurrent autoencoder for document summarization.
KW - Attention mechanism
KW - Sequence generation
KW - Sequence-to-sequence learning
KW - Variational autoencoder
UR - http://www.scopus.com/inward/record.url?scp=85079549014&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2019-1548
DO - 10.21437/Interspeech.2019-1548
M3 - Conference article
AN - SCOPUS:85079549014
SN - 2308-457X
VL - 2019-September
SP - 1318
EP - 1322
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019
Y2 - 15 September 2019 through 19 September 2019
ER -