TY - JOUR
T1 - Stochastic convolutional recurrent networks for language modeling
AU - Chien, Jen-Tzung
AU - Huang, Yu-Min
N1 - Publisher Copyright:
Copyright © 2020 ISCA
PY - 2020
Y1 - 2020
N2 - Sequential learning with recurrent neural networks (RNNs) has been widely developed for language modeling. An alternative approach to sequential learning is the temporal convolutional network (TCN), which can be viewed as a variant of the one-dimensional convolutional neural network (CNN). In general, RNNs and TCNs are suited to capturing the long-term and the short-term features of natural sentences, respectively. This paper is motivated to employ a TCN as the encoder that extracts short-term dependencies and an RNN as the decoder for language modeling, where these dependencies are integrated in a long-term semantic fashion for word prediction. A new sequential learning scheme based on the convolutional recurrent network (CRN) is developed to characterize the local dependencies as well as the global semantics in word sequences. Importantly, stochastic modeling of the CRN is proposed to enhance model capacity in the neural language model, where the uncertainties in training sentences are represented for variational inference. The complementary benefits of the CNN and RNN are merged in sequential learning, where a latent variable space is constructed as a generative model for sequential prediction. Experiments on language modeling demonstrate the effectiveness of the stochastic convolutional recurrent network relative to other sequential machines in terms of perplexity and word error rate.
AB - Sequential learning with recurrent neural networks (RNNs) has been widely developed for language modeling. An alternative approach to sequential learning is the temporal convolutional network (TCN), which can be viewed as a variant of the one-dimensional convolutional neural network (CNN). In general, RNNs and TCNs are suited to capturing the long-term and the short-term features of natural sentences, respectively. This paper is motivated to employ a TCN as the encoder that extracts short-term dependencies and an RNN as the decoder for language modeling, where these dependencies are integrated in a long-term semantic fashion for word prediction. A new sequential learning scheme based on the convolutional recurrent network (CRN) is developed to characterize the local dependencies as well as the global semantics in word sequences. Importantly, stochastic modeling of the CRN is proposed to enhance model capacity in the neural language model, where the uncertainties in training sentences are represented for variational inference. The complementary benefits of the CNN and RNN are merged in sequential learning, where a latent variable space is constructed as a generative model for sequential prediction. Experiments on language modeling demonstrate the effectiveness of the stochastic convolutional recurrent network relative to other sequential machines in terms of perplexity and word error rate.
KW - Convolutional neural network
KW - Language model
KW - Latent variable model
KW - Recurrent neural network
UR - http://www.scopus.com/inward/record.url?scp=85098228693&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2020-1493
DO - 10.21437/Interspeech.2020-1493
M3 - Conference article
AN - SCOPUS:85098228693
SN - 2308-457X
VL - 2020-October
SP - 3640
EP - 3644
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020
Y2 - 25 October 2020 through 29 October 2020
ER -