TY - JOUR
T1 - Discriminative layered nonnegative matrix factorization for speech separation
AU - Hsu, Chung Chien
AU - Chi, Tai-Shih
AU - Chien, Jen-Tzung
N1 - Publisher Copyright: Copyright © 2016 ISCA.
PY - 2016
Y1 - 2016
AB - This paper proposes discriminative layered nonnegative matrix factorization (DL-NMF) for monaural speech separation. Standard NMF builds a parts-based representation from a single layer of bases. It was recently extended to layered NMF (L-NMF), in which a tree of bases is estimated for multi-level or multi-aspect decomposition of a complex mixed signal. In this study, we develop DL-NMF by extending the generative bases of L-NMF to discriminative bases, which are estimated according to a discriminative criterion. This criterion optimizes the recovery of the mixed spectra from the separated spectra and minimizes the reconstruction errors between the separated spectra and the original source spectra. Experiments on single-channel speech separation show that DL-NMF outperforms NMF and L-NMF in terms of the SDR, SIR and SAR measures.
KW - Dictionary learning
KW - Discriminative learning
KW - Nonnegative matrix factorization
KW - Speech separation
UR - http://www.scopus.com/inward/record.url?scp=84994226065&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2016-415
DO - 10.21437/Interspeech.2016-415
M3 - Conference article
AN - SCOPUS:84994226065
SN - 2308-457X
VL - 08-12-September-2016
SP - 560
EP - 564
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
Y2 - 8 September 2016 through 12 September 2016
ER -