Voice activity detection based on frequency modulation of harmonics

Chung Chien Hsu, Tse En Lin, Jian Hueng Chen, Tai-Shih Chi

Research output: Conference contribution, peer-reviewed

14 citations (Scopus)

Abstract

In this paper, we propose a voice activity detection (VAD) algorithm based on spectro-temporal modulation structures of input sounds. A multi-resolution spectro-temporal analysis framework is used to inspect prominent speech structures. By comparing with an adaptive threshold, the proposed VAD distinguishes speech from non-speech based on the energy of the frequency modulation of harmonics. Compared with three standard VADs, ITU-T G.729B, ETSI AMR1 and AMR2, our proposed VAD significantly outperforms them in non-stationary noises in terms of the receiver operating characteristic (ROC) curves and the recognition rates from a practical distributed speech recognition (DSR) system.
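The decision step mentioned in the abstract, comparing a per-frame energy against an adaptive threshold, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `adaptive_threshold_vad`, the parameters `init_frames`, `margin` and `smoothing`, and the noise-floor update rule are all assumptions chosen for illustration. The input is assumed to be an already computed per-frame feature (in the paper's setting, the energy of the frequency modulation of harmonics); the spectro-temporal analysis that produces it is not shown.

```python
from typing import List, Sequence


def adaptive_threshold_vad(
    energies: Sequence[float],
    init_frames: int = 10,    # hypothetical: frames used to seed the noise floor
    margin: float = 3.0,      # hypothetical: threshold = margin * noise floor
    smoothing: float = 0.95,  # hypothetical: noise-floor smoothing factor
) -> List[bool]:
    """Label each frame as speech (True) or non-speech (False).

    `energies` holds one non-negative feature value per frame. How that
    feature is extracted from the audio is outside this sketch.
    """
    if not energies:
        return []

    # Assumption: the first few frames contain no speech, so their mean
    # is a usable initial estimate of the background level.
    head = energies[:init_frames]
    noise_floor = sum(head) / len(head)

    decisions: List[bool] = []
    for energy in energies:
        threshold = margin * noise_floor
        is_speech = energy > threshold
        decisions.append(is_speech)
        if not is_speech:
            # Update the background estimate only on non-speech frames,
            # so the threshold follows slowly varying noise.
            noise_floor = smoothing * noise_floor + (1.0 - smoothing) * energy
    return decisions


if __name__ == "__main__":
    # Synthetic feature track: quiet, then a burst, then quiet again.
    frames = [1.0] * 10 + [10.0] * 5 + [1.0] * 5
    print(adaptive_threshold_vad(frames))
```

Freezing the noise-floor update during detected speech is one common design choice; it keeps loud speech frames from raising the threshold, at the cost of slower recovery if the background level jumps abruptly.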

Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages: 6679-6683
Number of pages: 5
Publication status: Published - 18 Oct 2013
Event: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC, Canada
Duration: 26 May 2013 to 31 May 2013

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Country/Territory: Canada
City: Vancouver, BC
Period: 26/05/13 to 31/05/13
