Stochastic convolutional recurrent networks for language modeling

Jen Tzung Chien, Yu Min Huang

Research output: Conference article, peer-reviewed

Abstract

Sequential learning with recurrent neural networks (RNNs) has been widely developed for language modeling. An alternative form of sequential learning is implemented by the temporal convolutional network (TCN), which can be seen as a variant of the one-dimensional convolutional neural network (CNN). In general, RNNs and TCNs are suited to capturing the long-term and the short-term features of natural sentences, respectively. This paper uses a TCN as the encoder to extract short-term dependencies and an RNN as the decoder for language modeling, where the dependencies are integrated in a long-term semantic fashion for word prediction. A new sequential learning method based on the convolutional recurrent network (CRN) is developed to characterize the local dependencies as well as the global semantics in word sequences. Importantly, stochastic modeling for the CRN is proposed to increase model capacity in the neural language model, where the uncertainties in training sentences are represented for variational inference. The complementary benefits of the CNN and the RNN are merged in sequential learning, where the latent variable space is constructed as a generative model for sequential prediction. Experiments on language modeling demonstrate the effectiveness of the stochastic convolutional recurrent network relative to other sequential machines in terms of perplexity and word error rate.
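The data flow described in the abstract (a causal convolutional encoder, a sampled latent variable, and a recurrent decoder that predicts the next word) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the parameter names, the choice to draw one Gaussian latent variable per position from the encoder output, and the plain tanh recurrence are all assumptions made for illustration. The weights are random and untrained, and the variational training objective is omitted.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would learn these."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def causal_conv(xs, kernels):
    """Causal 1-D convolution: output t sees inputs t-k+1..t only (zero-padded)."""
    k, dim = len(kernels), len(xs[0])
    padded = [[0.0] * dim] * (k - 1) + xs
    out = []
    for t in range(len(xs)):
        acc = [0.0] * len(kernels[0])
        for j in range(k):
            acc = [a + b for a, b in zip(acc, matvec(kernels[j], padded[t + j]))]
        out.append([math.tanh(a) for a in acc])
    return out

def sample_latent(h, W_mu, W_logvar, rng):
    """Reparameterized Gaussian sample z = mu + sigma * eps (assumed form)."""
    mu, logvar = matvec(W_mu, h), matvec(W_logvar, h)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def rnn_step(s, z, W_s, W_z):
    """One step of a plain tanh recurrence driven by the latent sample."""
    return [math.tanh(a + b) for a, b in zip(matvec(W_s, s), matvec(W_z, z))]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def init_params(vocab, dim, hidden, latent, kernel_size, rng):
    """Hypothetical parameter set; all sizes are illustrative."""
    return {
        "embed": rand_matrix(vocab, dim, rng),
        "kernels": [rand_matrix(hidden, dim, rng) for _ in range(kernel_size)],
        "W_mu": rand_matrix(latent, hidden, rng),
        "W_logvar": rand_matrix(latent, hidden, rng),
        "W_s": rand_matrix(hidden, hidden, rng),
        "W_z": rand_matrix(hidden, latent, rng),
        "W_out": rand_matrix(vocab, hidden, rng),
    }

def crn_forward(token_ids, params, rng):
    """Return one next-word distribution per input position."""
    xs = [params["embed"][i] for i in token_ids]
    hs = causal_conv(xs, params["kernels"])  # encoder: short-term features
    s = [0.0] * len(params["W_s"])
    dists = []
    for h in hs:
        z = sample_latent(h, params["W_mu"], params["W_logvar"], rng)
        s = rnn_step(s, z, params["W_s"], params["W_z"])  # decoder: long-term state
        dists.append(softmax(matvec(params["W_out"], s)))
    return dists

def perplexity(target_probs):
    """exp of the average negative log-probability assigned to the targets."""
    return math.exp(-sum(math.log(p) for p in target_probs) / len(target_probs))

rng = random.Random(0)
params = init_params(vocab=10, dim=4, hidden=8, latent=3, kernel_size=3, rng=rng)
tokens = [1, 4, 2, 7, 3]
dists = crn_forward(tokens[:-1], params, rng)  # predict each following token
ppl = perplexity([d[t] for d, t in zip(dists, tokens[1:])])
print(f"perplexity of the untrained sketch: {ppl:.2f}")
```

Because the convolution is causal, the prediction at each position depends only on the current and earlier tokens, which is what allows the encoder to be used for next-word prediction. Perplexity here is the standard exponentiated average negative log-likelihood, so a model that assigns uniform probability over a 10-word vocabulary scores exactly 10.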

Original language: English
Pages (from - to): 3640-3644
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2020-October
Publication status: Published - 2020
Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration: 25 October 2020 to 29 October 2020
