TY - JOUR
T1 - Stochastic recurrent neural network for speech recognition
AU - Chien, Jen-Tzung
AU - Shen, Chen
PY - 2017/1/1
Y1 - 2017/1/1
N2 - This paper presents a new stochastic learning approach to construct a latent variable model for recurrent neural network (RNN) based speech recognition. A hybrid generative and discriminative stochastic network is implemented to build a deep classification model. In the implementation, we conduct stochastic modeling for hidden states of recurrent neural network based on the variational auto-encoder. The randomness of hidden neurons is represented by the Gaussian distribution with mean and variance parameters driven by neural weights and learned from variational inference. Importantly, the class labels of input speech frames are incorporated to regularize this deep model to sample the informative and discriminative features for reconstruction of classification outputs. We accordingly propose the stochastic RNN (SRNN) to reflect the probabilistic property in RNN classification system. A stochastic error backpropagation algorithm is implemented. The experiments on speech recognition using TIMIT and Aurora4 show the merit of the proposed SRNN.
AB - This paper presents a new stochastic learning approach to construct a latent variable model for recurrent neural network (RNN) based speech recognition. A hybrid generative and discriminative stochastic network is implemented to build a deep classification model. In the implementation, we conduct stochastic modeling for hidden states of recurrent neural network based on the variational auto-encoder. The randomness of hidden neurons is represented by the Gaussian distribution with mean and variance parameters driven by neural weights and learned from variational inference. Importantly, the class labels of input speech frames are incorporated to regularize this deep model to sample the informative and discriminative features for reconstruction of classification outputs. We accordingly propose the stochastic RNN (SRNN) to reflect the probabilistic property in RNN classification system. A stochastic error backpropagation algorithm is implemented. The experiments on speech recognition using TIMIT and Aurora4 show the merit of the proposed SRNN.
KW - Neural network
KW - Speech recognition
KW - Stochastic error backpropagation
KW - Variational inference
UR - http://www.scopus.com/inward/record.url?scp=85039155226&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2017-856
DO - 10.21437/Interspeech.2017-856
M3 - Conference article
AN - SCOPUS:85039155226
SN - 2308-457X
VL - 2017-August
SP - 1313
EP - 1317
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017
Y2 - 20 August 2017 through 24 August 2017
ER -