Stochastic convolutional recurrent networks for language modeling

Jen-Tzung Chien, Yu-Min Huang

Research output: Conference article › Peer-reviewed

1 citation (Scopus)


Sequential learning with recurrent neural networks (RNNs) has been widely developed for language modeling. An alternative is the temporal convolutional network (TCN), which can be seen as a variant of the one-dimensional convolutional neural network (CNN). In general, RNNs and TCNs are suited to capturing the long-term and short-term features of natural sentences, respectively. This paper employs the TCN as an encoder to extract short-term dependencies and the RNN as a decoder for language modeling, where these dependencies are integrated in a long-term semantic fashion for word prediction. A new sequential learning scheme based on the convolutional recurrent network (CRN) is developed to characterize the local dependencies as well as the global semantics of word sequences. Importantly, stochastic modeling for the CRN is proposed to increase model capacity in the neural language model, where the uncertainties in training sentences are represented for variational inference. The complementary benefits of the CNN and RNN are merged in sequential learning, where the latent variable space is constructed as a generative model for sequential prediction. Experiments on language modeling demonstrate the effectiveness of the stochastic convolutional recurrent network relative to other sequential machines in terms of perplexity and word error rate.
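The encoder-decoder pipeline described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes toy numpy arrays in place of learned parameters, uses a single causal 1-D convolution to stand in for the TCN encoder (short-term dependencies), a plain tanh RNN for the decoder (long-term semantics), and the standard reparameterization trick to stand in for the stochastic latent variable used in variational inference. All function names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def causal_conv1d(x, w):
    """Causal 1-D convolution: the output at step t sees only x[:t+1].
    x: (T, d_in) sequence of embeddings; w: (k, d_in, d_out) filter.
    Left-pads with zeros, as in a TCN layer."""
    k = w.shape[0]
    xp = np.vstack([np.zeros((k - 1, x.shape[1])), x])  # left padding only
    return np.stack([
        np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0])
    ])

def rnn_decode(feats, Wx, Wh):
    """Plain tanh RNN run over the convolutional features,
    accumulating a long-term (global) representation."""
    h = np.zeros(Wh.shape[0])
    out = []
    for x_t in feats:
        h = np.tanh(x_t @ Wx + h @ Wh)
        out.append(h)
    return np.stack(out)

def reparameterize(mu, logvar, rng):
    """z = mu + sigma * eps: the stochastic latent draw used in
    variational inference (reparameterization trick)."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

# Toy dimensions (hypothetical): sequence length, embedding, conv, RNN sizes.
T, d_emb, d_conv, d_rnn = 6, 8, 8, 8
x = rng.standard_normal((T, d_emb))                # word embeddings
w = 0.1 * rng.standard_normal((3, d_emb, d_conv))  # kernel width 3
conv_feats = causal_conv1d(x, w)                   # TCN-style encoder output
Wx = 0.1 * rng.standard_normal((d_conv, d_rnn))
Wh = 0.1 * rng.standard_normal((d_rnn, d_rnn))
h = rnn_decode(conv_feats, Wx, Wh)                 # RNN decoder states
# Illustrative latent: treat the last state as mu with a fixed log-variance.
mu, logvar = h[-1], -np.ones(d_rnn)
z = reparameterize(mu, logvar, rng)
print(conv_feats.shape, h.shape, z.shape)
```

A next-word distribution would then be computed from `h` (or from `z` in the stochastic variant) through a softmax output layer; the causal padding guarantees that no future word leaks into the prediction at step t.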

Pages (from-to): 3640-3644
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2020
Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration: 25 Oct 2020 - 29 Oct 2020
