TY - JOUR
T1 - Layered nonnegative matrix factorization for speech separation
AU - Hsu, Chung Chien
AU - Chien, Jen-Tzung
AU - Chi, Tai-Shih
N1 - Publisher Copyright: Copyright © 2015 ISCA.
PY - 2015/9
Y1 - 2015/9
N2 - This paper proposes a layered nonnegative matrix factorization (L-NMF) algorithm for speech separation. The standard NMF method extracts parts-based bases from nonnegative training data and is often used to separate mixed spectrograms. The proposed L-NMF algorithm comprises several layers of standard NMF blocks. During training, each layer of the L-NMF is initialized separately and then fine-tuned by minimizing the propagated reconstruction error. More complex bases of the training data emerge in deeper layers of the L-NMF through progressive combination of the parts-based bases extracted in the first layer. In other words, these complex bases carry the collective information of the parts-based bases. The bases learned by all layers are then used to separate spectrograms in the conventional NMF way. Simulation results show that the proposed L-NMF outperforms the standard NMF in terms of the source-to-distortion ratio (SDR).
AB - This paper proposes a layered nonnegative matrix factorization (L-NMF) algorithm for speech separation. The standard NMF method extracts parts-based bases from nonnegative training data and is often used to separate mixed spectrograms. The proposed L-NMF algorithm comprises several layers of standard NMF blocks. During training, each layer of the L-NMF is initialized separately and then fine-tuned by minimizing the propagated reconstruction error. More complex bases of the training data emerge in deeper layers of the L-NMF through progressive combination of the parts-based bases extracted in the first layer. In other words, these complex bases carry the collective information of the parts-based bases. The bases learned by all layers are then used to separate spectrograms in the conventional NMF way. Simulation results show that the proposed L-NMF outperforms the standard NMF in terms of the source-to-distortion ratio (SDR).
KW - Dictionary learning
KW - Layered NMF
KW - NMF
KW - Speech separation
UR - http://www.scopus.com/inward/record.url?scp=84959087653&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2015-217
DO - 10.21437/Interspeech.2015-217
M3 - Conference article
AN - SCOPUS:84959087653
SN - 2308-457X
VL - 2015-January
SP - 628
EP - 632
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Y2 - 6 September 2015 through 10 September 2015
ER -