Minimum classification error based spectro-temporal feature extraction for robust audio classification

Yuan Fu Liao*, Chia Hsing Lin, We Der Fang

*Corresponding author of this work

Research output: Conference article › Peer-reviewed

Abstract

Mel-frequency cepstral coefficients (MFCCs) are the most popular features for automatic audio classification (AAC). However, MFCCs are often not robust in adverse environments. In this paper, a minimum classification error (MCE)-based method is proposed to extract new and robust spectro-temporal features as alternatives to MFCCs. The robustness of the proposed new features is evaluated on noisy non-speech sounds of the RWCP Sound Scene Database in Real Acoustic Environments with Aurora 2 multi-condition training task-like settings. Experimental results show that the proposed new features achieved the lowest average recognition error rate of 3.17%, which is much better than state-of-the-art MFCCs plus mean subtraction, variance normalization and ARMA filtering (MFCC+MVA, 4.31%), Gabor filters with principal component analysis (Gabor+PCA, 4.43%) and linear discriminant analysis (LDA, 4.20%) features. We thus confirm the robustness of the proposed spectro-temporal feature extraction approach.
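The strongest MFCC baseline above is MVA post-processing: per-utterance mean subtraction, variance normalization, and ARMA filtering along the time axis. A minimal sketch of this pipeline is shown below; it is not the authors' implementation, and it stands in a symmetric moving-average smoother for the ARMA filter (the filter order `M` and the NumPy-based layout of frames x dimensions are assumptions for illustration).

```python
import numpy as np

def mva(features, M=2):
    """MVA post-processing sketch for a feature matrix of shape (frames, dims).

    Steps: (1) subtract the per-dimension mean, (2) normalize each dimension
    to unit variance, (3) smooth along time with a symmetric moving average
    of half-width M (a simplified stand-in for the ARMA filter).
    """
    # (1) + (2): mean subtraction and variance normalization per dimension
    normalized = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)

    # (3): temporal smoothing frame by frame, clipping the window at the edges
    T = normalized.shape[0]
    smoothed = np.empty_like(normalized)
    for t in range(T):
        lo, hi = max(0, t - M), min(T, t + M + 1)
        smoothed[t] = normalized[lo:hi].mean(axis=0)
    return smoothed
```

In the multi-condition setting evaluated in the paper, this kind of per-utterance normalization reduces the mismatch between clean training features and noisy test features, which is why MFCC+MVA serves as the reference baseline.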

Original language: English
Pages (from - to): 241-244
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2011
Event: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
Duration: 27 Aug 2011 - 31 Aug 2011
