Stochastic convolutional recurrent networks for language modeling

Jen-Tzung Chien, Yu-Min Huang

    Research output: Conference article › peer-review

    Abstract

    Sequential learning using the recurrent neural network (RNN) has been popularly developed for language modeling. An alternative sequential learning approach is the temporal convolutional network (TCN), which can be seen as a variant of the one-dimensional convolutional neural network (CNN). In general, RNN and TCN are suited to capturing the long-term and the short-term features of natural sentences, respectively. This paper proposes using TCN as an encoder to extract short-term dependencies and RNN as a decoder for language modeling, where these dependencies are integrated in a long-term semantic fashion for word prediction. A new sequential learning scheme based on the convolutional recurrent network (CRN) is developed to characterize the local dependencies as well as the global semantics in word sequences. Importantly, stochastic modeling for the CRN is proposed to increase the capacity of the neural language model, where the uncertainties in training sentences are represented for variational inference. The complementary benefits of CNN and RNN are merged in sequential learning, where the latent variable space is constructed as a generative model for sequential prediction. Experiments on language modeling demonstrate the effectiveness of the stochastic convolutional recurrent network relative to other sequential machines in terms of perplexity and word error rate.
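
    As a rough illustration of the encoder-decoder structure described in the abstract, below is a minimal, hypothetical sketch assuming PyTorch: a single causal 1-D convolution stands in for the TCN encoder, a Gaussian latent variable with reparameterization supplies the stochastic component for variational inference, and an LSTM decoder predicts the next word. All layer names, sizes, and the single-convolution encoder are illustrative assumptions, not the authors' exact configuration.

# Hypothetical sketch of a stochastic convolutional recurrent network for
# language modeling; hyperparameters are illustrative, not from the paper.
import torch
import torch.nn as nn

class StochasticCRN(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, conv_dim=256,
                 latent_dim=64, hidden_dim=512, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Causal 1-D convolution: pad on the left so step t only sees <= t.
        self.pad = kernel_size - 1
        self.conv = nn.Conv1d(embed_dim, conv_dim, kernel_size)
        # Posterior parameters of a Gaussian latent variable per time step.
        self.to_mu = nn.Linear(conv_dim, latent_dim)
        self.to_logvar = nn.Linear(conv_dim, latent_dim)
        # RNN decoder consumes the word embedding concatenated with the sample.
        self.rnn = nn.LSTM(embed_dim + latent_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer word indices
        e = self.embed(tokens)                          # (B, T, E)
        c = self.conv(nn.functional.pad(e.transpose(1, 2), (self.pad, 0)))
        c = torch.relu(c).transpose(1, 2)               # (B, T, C) short-term features
        mu, logvar = self.to_mu(c), self.to_logvar(c)   # latent posterior per step
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        h, _ = self.rnn(torch.cat([e, z], dim=-1))      # long-term semantics
        logits = self.out(h)                            # next-word prediction
        # KL divergence against a standard normal prior, for the variational bound.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
        return logits, kl

    Training would then minimize the cross-entropy of the predicted next word plus a weighted KL term, in the usual variational fashion; the weighting schedule is again an assumption rather than a detail taken from the paper.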

    Original language: English
    Pages (from-to): 3640-3644
    Number of pages: 5
    Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
    Volume: 2020-October
    DOIs
    Publication status: Published - 2020
    Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
    Duration: 25 Oct 2020 → 29 Oct 2020
