TY - JOUR
T1 - Hybrid accelerated optimization for speech recognition
AU - Chien, Jen-Tzung
AU - Huang, Pei-Wen
AU - Lee, Tan
N1 - Publisher Copyright:
Copyright © 2016 ISCA.
PY - 2016
Y1 - 2016
N2 - The optimization procedure is crucial for achieving desirable performance in speech recognition based on deep neural networks (DNNs). Conventionally, DNNs are trained with mini-batch stochastic gradient descent (SGD), which is stable but prone to becoming trapped in local optima. A recent approach based on Nesterov's accelerated gradient descent (NAG) merges the current momentum information into the correction of the SGD update. NAG is less likely to become stuck in a local minimum, so the convergence rate is improved. In general, optimization based on SGD is more stable, while optimization based on NAG is faster and more accurate. This study aims to boost the performance of speech recognition by combining the complementary SGD and NAG methods. A new hybrid optimization is proposed that integrates SGD with momentum and NAG through an interpolation scheme, run continuously in each mini-batch according to the change rate of the cost function over two consecutive learning epochs. The tradeoff between the two algorithms can thus be balanced for mini-batch optimization. Experiments on speech recognition using CUSENT and Aurora-4 show the effectiveness of the hybrid accelerated optimization in DNN acoustic models.
AB - The optimization procedure is crucial for achieving desirable performance in speech recognition based on deep neural networks (DNNs). Conventionally, DNNs are trained with mini-batch stochastic gradient descent (SGD), which is stable but prone to becoming trapped in local optima. A recent approach based on Nesterov's accelerated gradient descent (NAG) merges the current momentum information into the correction of the SGD update. NAG is less likely to become stuck in a local minimum, so the convergence rate is improved. In general, optimization based on SGD is more stable, while optimization based on NAG is faster and more accurate. This study aims to boost the performance of speech recognition by combining the complementary SGD and NAG methods. A new hybrid optimization is proposed that integrates SGD with momentum and NAG through an interpolation scheme, run continuously in each mini-batch according to the change rate of the cost function over two consecutive learning epochs. The tradeoff between the two algorithms can thus be balanced for mini-batch optimization. Experiments on speech recognition using CUSENT and Aurora-4 show the effectiveness of the hybrid accelerated optimization in DNN acoustic models.
KW - Deep neural network
KW - Hybrid optimization
KW - Speech recognition
KW - Stochastic gradient descent
UR - http://www.scopus.com/inward/record.url?scp=84994226700&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2016-192
DO - 10.21437/Interspeech.2016-192
M3 - Conference article
AN - SCOPUS:84994226700
SN - 2308-457X
VL - 08-12-September-2016
SP - 3399
EP - 3403
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
Y2 - 8 September 2016 through 12 September 2016
ER -