High-speed power-efficient coarse-grained convolver architecture using depth-first compression scheme

Yi Lin Wu, Yi Lu, Juinn Dar Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Convolutional neural networks (CNNs) play an important role in various applications, e.g., computer vision. Since CNN computations require numerous multiply-accumulate (MAC) operations, performing them efficiently is a crucial issue for CNN hardware accelerators. In this paper, we propose a high-speed power-efficient convolver architecture for CNN acceleration. A 3×3 convolver must produce one output every cycle. This is commonly accomplished by summing the results of nine parallel multiplications, which requires ten carry-propagation adders (CPAs) in total. The proposed coarse-grained convolver instead breaks the boundary between multipliers and reduces all partial products globally. Consequently, it requires only one CPA to generate the final outcome. It also features a globally delay-optimized partial product reduction tree and a depth-first compression scheme for both area and power minimization. The proposed convolver has been implemented using TSMC 40 nm technology. Compared to a conventional 3×3 convolver baseline design, our design reduces area and power by 15.8% and 26.5%, respectively, at a clock rate of 1 GHz.
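The idea of reducing all partial products globally and finishing with a single CPA can be illustrated with a small behavioral model in Python. This is only a sketch under stated assumptions, not the paper's design: operands are treated as unsigned 8-bit values (the signed case and the compensation vector listed in the keywords are not modeled), the 3:2 compressors are applied in a simple sequential order rather than in a delay-optimized tree or a depth-first order, and the names `partial_products`, `csa`, `reduce_to_two`, and `convolve_global` are hypothetical.

```python
from typing import List, Sequence, Tuple


def partial_products(a: int, b: int, width: int = 8) -> List[int]:
    """Unsigned shift-and-AND partial products of a * b.

    Assumes 0 <= b < 2**width; one row is produced per set bit of b.
    """
    return [a << i for i in range(width) if (b >> i) & 1]


def csa(x: int, y: int, z: int) -> Tuple[int, int]:
    """Word-level 3:2 carry-save compressor: x + y + z == s + c."""
    s = x ^ y ^ z                               # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1      # majority bits, shifted as carries
    return s, c


def reduce_to_two(rows: Sequence[int]) -> Tuple[int, int]:
    """Compress rows with 3:2 compressors until at most two remain.

    The compression order here is arbitrary (illustrative only); it does not
    model any delay optimization.
    """
    rows = list(rows)
    while len(rows) > 2:
        s, c = csa(rows.pop(), rows.pop(), rows.pop())
        rows += [s, c]
    rows += [0] * (2 - len(rows))               # pad if fewer than two rows
    return rows[0], rows[1]


def convolve_global(window: Sequence[int], kernel: Sequence[int],
                    width: int = 8) -> int:
    """3x3 dot product with one global reduction and one final addition."""
    rows: List[int] = []
    for a, b in zip(window, kernel):            # pool the rows of all nine products
        rows.extend(partial_products(a, b, width))
    s, c = reduce_to_two(rows)
    return s + c                                # the only carry-propagating add


window = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [9, 8, 7, 6, 5, 4, 3, 2, 1]
# The result matches the conventional multiply-then-sum formulation.
assert convolve_global(window, kernel) == sum(a * b for a, b in zip(window, kernel))
```

In this model the only carry-propagating addition is the final `s + c`; every earlier step keeps the running total in carry-save form, which is the property the abstract attributes to the coarse-grained convolver.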

Original language: English
Title of host publication: 2020 IEEE International Symposium on Circuits and Systems, ISCAS 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728133201
State: Published - Oct 2020
Event: 52nd IEEE International Symposium on Circuits and Systems, ISCAS 2020 - Virtual, Online
Duration: 10 Oct 2020 to 21 Oct 2020

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
Volume: 2020-October
ISSN (Print): 0271-4310

Conference

Conference: 52nd IEEE International Symposium on Circuits and Systems, ISCAS 2020
City: Virtual, Online
Period: 10/10/20 to 21/10/20

Keywords

  • Compensation vector
  • Convolutional neural network (CNN)
  • Convolver design
  • Depth-first compression
  • Hardware accelerator
  • Multiply-accumulate operation
