An Efficient and Low-Power MLP Accelerator Architecture Supporting Structured Pruning, Sparse Activations and Asymmetric Quantization for Edge Computing

Wei Chen Lin, Ya Chu Chang, Juinn Dar Huang

Research output: Conference contribution (peer-reviewed)

6 citations (Scopus)

Abstract

The multilayer perceptron (MLP) is one of the most popular neural network architectures used for classification, regression, and recommendation systems today. In this paper, we propose an efficient and low-power MLP accelerator for edge computing. The accelerator has three key features. First, it aligns with a novel structured weight pruning algorithm that requires only minimal hardware support. Second, it takes advantage of activation sparsity to minimize power. Third, it supports asymmetric quantization of both weights and activations to improve model accuracy, especially when those values are stored in low-precision formats. Furthermore, the number of processing elements (PEs) is determined by the available external memory bandwidth to ensure high PE utilization, which avoids wasted area and energy. Experimental results show that the proposed MLP accelerator, with only 8 MACs, operates at 1.6 GHz in TSMC 40 nm technology, delivers an equivalent 899 GOPS of computing power after structured weight pruning on a well-known image classification model, and achieves an equivalent energy efficiency of 9.7 TOPS/W, while the model accuracy loss is less than 0.3% with the help of asymmetric quantization.
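The abstract does not spell out the quantization scheme, so the following is only a minimal sketch of the commonly used affine (scale plus zero-point) form of asymmetric quantization, not the authors' implementation. All function names are hypothetical, and the symmetric baseline is included purely for comparison. It illustrates why an asymmetric mapping can help at low precision: when values are skewed (for example, non-negative activations), every code is spent on the range that actually occurs.

```python
def asymmetric_params(values, num_bits):
    """Return (scale, zero_point) mapping [min, max] onto [0, 2**num_bits - 1].

    Assumption: the range is widened to include 0.0 so that zero has an
    exact integer code.
    """
    qmax = (1 << num_bits) - 1
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:  # all values are zero; any positive scale works
        return 1.0, 0
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits):
    """Map real values to unsigned integer codes, clamped to the code range."""
    qmax = (1 << num_bits) - 1
    return [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Recover approximate real values from integer codes."""
    return [scale * (q - zero_point) for q in codes]


def symmetric_roundtrip(values, num_bits):
    """Baseline: signed symmetric quantization followed by dequantization."""
    qmax = (1 << (num_bits - 1)) - 1  # e.g. 7 for 4 bits
    amax = max(abs(v) for v in values)
    scale = amax / qmax if amax else 1.0
    return [scale * max(-qmax, min(qmax, round(v / scale))) for v in values]


def max_abs_error(reference, approx):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(reference, approx))
```

For a non-negative input such as `[0.0, 0.1, 0.4, 0.9, 1.3, 2.0]` at 4 bits, the asymmetric round trip uses all 16 codes on `[0, 2]`, whereas the symmetric baseline reserves half of its codes for negative values that never occur, so its worst-case error is larger.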

Original language: English
Title of host publication: 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781665419130
Publication status: Published - 6 Jun 2021
Event: 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021 - Washington, United States
Duration: 6 Jun 2021 to 9 Jun 2021

Publication series

Name: 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021

Conference

Conference: 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021
Country/Territory: United States
City: Washington
Period: 6 Jun 2021 to 9 Jun 2021
