An Efficient and Low-Power MLP Accelerator Architecture Supporting Structured Pruning, Sparse Activations and Asymmetric Quantization for Edge Computing

Wei-Chen Lin, Ya-Chu Chang, Juinn-Dar Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

Multilayer perceptron (MLP) is one of the most popular neural network architectures used today for classification, regression, and recommendation systems. In this paper, we propose an efficient and low-power MLP accelerator for edge computing. The accelerator has three key features. First, it aligns with a novel structured weight pruning algorithm that requires only minimal hardware support. Second, it takes advantage of activation sparsity for power minimization. Third, it supports asymmetric quantization on both weights and activations to boost model accuracy, especially when those values are in low-precision formats. Furthermore, the number of PEs is determined based on the available external memory bandwidth to ensure high PE utilization, which avoids area and energy waste. Experimental results show that the proposed MLP accelerator with only 8 MACs operates at 1.6GHz using the TSMC 40nm technology, delivers 899GOPS equivalent computing power after structured weight pruning on a well-known image classification model, and achieves an equivalent energy efficiency of 9.7TOPS/W, while the model accuracy loss is less than 0.3% with the help of asymmetric quantization.
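To illustrate the third feature, the sketch below shows the standard asymmetric (affine) quantization scheme the abstract refers to: mapping a tensor's observed range [x_min, x_max] onto the full unsigned integer grid with a scale and a zero-point, so that low-precision formats waste none of their codes. This is a generic textbook formulation, not the paper's hardware implementation; the function names and the 8-bit default are illustrative assumptions.

```python
import numpy as np

def quantize_asymmetric(x, num_bits=8):
    """Affine quantization: q = clip(round(x / scale) + zero_point, 0, 2^b - 1).

    Unlike symmetric quantization, the zero_point offset lets the integer
    grid cover an asymmetric range such as [-1, 2] without wasting codes.
    (Illustrative sketch, not the accelerator's actual datapath.)
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Guard against a degenerate all-constant tensor (scale would be 0).
    scale = max((x_max - x_min) / (qmax - qmin), 1e-8)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate real value: x ≈ scale * (q - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)
```

For example, an activation range of [-1.0, 2.0] quantized to 8 bits yields a zero-point of 85, so the real value 0.0 maps exactly onto an integer code, which is what keeps accuracy loss small for ReLU-style activations dominated by zeros.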

Original language: English
Title of host publication: 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665419130
DOIs
State: Published - 6 Jun 2021
Event: 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021 - Washington, United States
Duration: 6 Jun 2021 - 9 Jun 2021

Publication series

Name: 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021

Conference

Conference: 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021
Country/Territory: United States
City: Washington
Period: 6/06/21 - 9/06/21

Keywords

  • algorithm-hardware co-design
  • hardware accelerator
  • model compression
  • multilayer perceptron
