Sparse coding based music genre classification using spectro-temporal modulations

Kai Chun Hsu, Chih Shan Lin, Tai-Shih Chi

Research output: Conference contribution, peer-reviewed

6 citations (Scopus)

Abstract

Spectro-temporal modulations (STMs) of sound convey timbre and rhythm information, so they are intuitively useful for automatic music genre classification. STMs are usually extracted from a time-frequency representation of the acoustic signal. In this paper, we investigate the efficacy of two kinds of STM features, the Gabor features and the rate-scale (RS) features, selectively extracted from various time-frequency representations, including the short-time Fourier transform (STFT) spectrogram, the constant-Q transform (CQT) spectrogram and the auditory (AUD) spectrogram, in recognizing the music genre. In our system, dictionary learning and sparse coding techniques are adopted for training the support vector machine (SVM) classifier. Both spectral-type features and modulation-type features are used to test the system. Experimental results show that the RS features extracted from the log-magnitude CQT spectrogram produce the highest recognition rate in classifying the music genre.
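The pipeline described in the abstract (time-frequency representation, modulation features, sparse coding) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several parts are simplifying assumptions: a plain 2-D FFT of a log-magnitude STFT spectrogram stands in for the Gabor and RS filter banks, a dictionary of exemplar feature vectors stands in for a learned dictionary, ISTA is used as one common sparse-coding solver, and the SVM stage is omitted. All function names, signals, and parameter values are hypothetical.

```python
import numpy as np

def log_magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram with shape (frequency bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=0))
    return np.log(magnitude + 1e-8)

def modulation_features(spectrogram):
    """Crude modulation map: 2-D FFT magnitude of the mean-removed spectrogram.
    Assumption: a simplified stand-in for Gabor or rate-scale filtering."""
    centered = spectrogram - spectrogram.mean()
    return np.abs(np.fft.fftshift(np.fft.fft2(centered)))

def sparse_code_ista(x, dictionary, penalty=0.1, n_iter=200):
    """Minimize 0.5 * ||x - D a||^2 + penalty * ||a||_1 over a using ISTA."""
    step = 1.0 / np.linalg.norm(dictionary, ord=2) ** 2  # 1 / Lipschitz constant
    code = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        gradient = dictionary.T @ (dictionary @ code - x)
        z = code - step * gradient
        # Soft-thresholding is the proximal step for the L1 penalty.
        code = np.sign(z) * np.maximum(np.abs(z) - step * penalty, 0.0)
    return code

# Hypothetical test signals: amplitude-modulated tones, one second at 8 kHz.
rng = np.random.default_rng(0)
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate

def am_tone(carrier_hz, rate_hz):
    envelope = 1.0 + 0.8 * np.sin(2 * np.pi * rate_hz * t)
    return envelope * np.sin(2 * np.pi * carrier_hz * t)

def feature_vector(signal):
    stm = modulation_features(log_magnitude_spectrogram(signal))
    vec = stm.ravel()
    return vec / np.linalg.norm(vec)

# Assumption: exemplar atoms replace a dictionary learned from training data.
atoms = [feature_vector(am_tone(c, r)) for c in (500, 1000, 2000) for r in (2, 4, 8)]
dictionary = np.stack(atoms, axis=1)  # shape (features, atoms)

query = feature_vector(am_tone(1000, 4) + 0.01 * rng.standard_normal(t.size))
code = sparse_code_ista(query, dictionary)
print(np.round(code, 3))
```

In a full system along the lines of the abstract, sparse codes such as `code` would be computed for every training excerpt and passed to an SVM classifier; a CQT or auditory spectrogram could replace the STFT front end without changing the later stages.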

Original language: English
Title of host publication: Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016
Editors: Johanna Devaney, Douglas Turnbull, Michael I. Mandel, George Tzanetakis
Publisher: International Society for Music Information Retrieval
Pages: 744-750
Number of pages: 7
ISBN (electronic): 9780692755068
Publication status: Published - Aug 2016
Event: 17th International Society for Music Information Retrieval Conference, ISMIR 2016 - New York, United States
Duration: 7 Aug 2016 to 11 Aug 2016

Publication series

Name: Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016

Conference

Conference: 17th International Society for Music Information Retrieval Conference, ISMIR 2016
Country/Territory: United States
City: New York
Period: 7/08/16 to 11/08/16

