TY - GEN
T1 - A 2.17mW Acoustic DSP Processor with CNN-FFT Accelerators for Intelligent Hearing Aided Devices
AU - Lee, Yu Chi
AU - Chi, Tai Shih
AU - Yang, Chia Hsiang
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/3
Y1 - 2019/3
N2 - This paper proposes an acoustic DSP processor with a neural network core for speech enhancement. Accelerators for convolutional neural network (CNN) and fast Fourier transform (FFT) computations are embedded. The CNN-based speech enhancement algorithm takes the speech signal's spectrogram as the model's input and predicts the desired speech mask, which is used to enhance speech intelligibility. An array of multiply-accumulate (MAC) and coordinate rotation digital computer (CORDIC) engines is deployed to efficiently compute linear and nonlinear functions. Hardware sharing is applied to reduce hardware area by leveraging the high similarity between CNN and FFT computations. The proposed DSP processor chip is fabricated in a 40-nm CMOS technology with a core area of 4.3 mm². The chip's power dissipation is 2.17 mW at an operating frequency of 5 MHz. The CNN accelerator supports both convolutional and fully-connected layers and achieves an energy efficiency of 1200 to 2180 GOPS/W, despite the flexibility for FFT. The speech intelligibility can be enhanced by up to 41% under low SNR conditions.
AB - This paper proposes an acoustic DSP processor with a neural network core for speech enhancement. Accelerators for convolutional neural network (CNN) and fast Fourier transform (FFT) computations are embedded. The CNN-based speech enhancement algorithm takes the speech signal's spectrogram as the model's input and predicts the desired speech mask, which is used to enhance speech intelligibility. An array of multiply-accumulate (MAC) and coordinate rotation digital computer (CORDIC) engines is deployed to efficiently compute linear and nonlinear functions. Hardware sharing is applied to reduce hardware area by leveraging the high similarity between CNN and FFT computations. The proposed DSP processor chip is fabricated in a 40-nm CMOS technology with a core area of 4.3 mm². The chip's power dissipation is 2.17 mW at an operating frequency of 5 MHz. The CNN accelerator supports both convolutional and fully-connected layers and achieves an energy efficiency of 1200 to 2180 GOPS/W, despite the flexibility for FFT. The speech intelligibility can be enhanced by up to 41% under low SNR conditions.
KW - CMOS integrated circuits
KW - Speech Enhancement
KW - convolutional neural network (CNN)
KW - fast Fourier transform (FFT)
KW - reconfigurable architecture
UR - http://www.scopus.com/inward/record.url?scp=85070477956&partnerID=8YFLogxK
U2 - 10.1109/AICAS.2019.8771631
DO - 10.1109/AICAS.2019.8771631
M3 - Conference contribution
AN - SCOPUS:85070477956
T3 - Proceedings 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2019
SP - 97
EP - 101
BT - Proceedings 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1st IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2019
Y2 - 18 March 2019 through 20 March 2019
ER -