TY - JOUR
T1 - A two-stage singing voice separation algorithm using spectro-temporal modulation features
AU - Yen, Frederick Z.
AU - Huang, Mao Chang
AU - Chi, Tai-Shih
N1 - Publisher Copyright: Copyright © 2015 ISCA.
PY - 2015/9
Y1 - 2015/9
N2 - A two-stage singing voice separation algorithm using spectro-temporal modulation features is proposed in this paper. First, music clips are transformed into auditory spectrograms, and the spectro-temporal modulation content of every time-frequency (T-F) unit of the auditory spectrograms is extracted using an auditory model. Then, the T-F units are sequentially clustered by the expectation-maximization (EM) algorithm into percussive, harmonic, and vocal units through the proposed two-stage algorithm. Lastly, the singing voice is synthesized from the clustered vocal T-F units via time-frequency masking. The algorithm was evaluated on the MIR-1K dataset and demonstrated better separation results than our previously proposed one-stage algorithm.
AB - A two-stage singing voice separation algorithm using spectro-temporal modulation features is proposed in this paper. First, music clips are transformed into auditory spectrograms, and the spectro-temporal modulation content of every time-frequency (T-F) unit of the auditory spectrograms is extracted using an auditory model. Then, the T-F units are sequentially clustered by the expectation-maximization (EM) algorithm into percussive, harmonic, and vocal units through the proposed two-stage algorithm. Lastly, the singing voice is synthesized from the clustered vocal T-F units via time-frequency masking. The algorithm was evaluated on the MIR-1K dataset and demonstrated better separation results than our previously proposed one-stage algorithm.
KW - Auditory scene analysis
KW - Singing voice separation
KW - Spectro-temporal modulation
UR - http://www.scopus.com/inward/record.url?scp=84959123663&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2015-669
DO - 10.21437/Interspeech.2015-669
M3 - Conference article
AN - SCOPUS:84959123663
SN - 2308-457X
VL - 2015-January
SP - 3321
EP - 3324
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Y2 - 6 September 2015 through 10 September 2015
ER -