This paper proposes a layered nonnegative matrix factorization (L-NMF) algorithm for speech separation. The standard NMF method extracts parts-based bases from nonnegative training data and is often used to separate mixed spectrograms. The proposed L-NMF algorithm comprises several layers of standard NMF blocks. During training, each layer of the L-NMF is initialized separately and then fine-tuned by minimizing the propagated reconstruction error. More complicated bases of the training data emerge in deeper layers of the L-NMF through the progressive combination of the parts-based bases extracted in the first layer. In other words, these complicated bases contain collective information from the parts-based bases. The bases learned by all layers are then used to separate spectrograms in the conventional NMF way. Simulation results show that the proposed L-NMF outperforms the standard NMF in terms of the source-to-distortion ratio (SDR).
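The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a Euclidean cost with multiplicative updates (the paper may use a different divergence), and it assumes that each deeper layer factorizes the activations of the layer before it, so that deeper bases are products of the shallower ones. It covers only the layer-wise initialization and omits the fine-tuning step that minimizes the propagated reconstruction error. The names `nmf`, `layered_bases`, and `separate` are hypothetical.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero in the multiplicative updates


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Euclidean NMF, V ~ W @ H, via multiplicative updates.

    If W is supplied it is held fixed and only H is estimated
    (rank is then ignored and taken from W).
    """
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if not fixed:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def layered_bases(V, ranks, n_iter=200, seed=0):
    """Layer-wise initialization only (assumed structure, no fine-tuning).

    Layer 1 factorizes V; each deeper layer factorizes the previous
    layer's activations. Bases from every layer are mapped back to the
    spectrogram domain and stacked side by side.
    """
    bases, proj, X = [], np.eye(V.shape[0]), V
    for i, r in enumerate(ranks):
        W, X = nmf(X, r, n_iter, seed=seed + i)
        proj = proj @ W          # combine this layer's bases with shallower ones
        bases.append(proj)
    return np.hstack(bases)


def separate(V_mix, B1, B2, n_iter=200):
    """Conventional NMF separation with fixed per-source bases B1 and B2."""
    B = np.hstack([B1, B2])
    _, H = nmf(V_mix, B.shape[1], n_iter, W=B)
    S1 = B1 @ H[:B1.shape[1]]
    S2 = B2 @ H[B1.shape[1]:]
    mask = S1 / (S1 + S2 + EPS)  # soft mask in [0, 1]
    return mask * V_mix, (1.0 - mask) * V_mix
```

In use, one would call `layered_bases` once on the training magnitude spectrogram of each speaker, then pass the two resulting basis matrices and the mixture spectrogram to `separate`. Because the two estimates come from complementary soft masks, they sum to the mixture.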
|Pages (from-to)||628-632|
|Journal||Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH|
|Publication status||Published - 1 Jan 2015|
|Event||16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany|
Duration: 6 Sep 2015 → 10 Sep 2015