Energy-Efficient Accelerator Design With Tile-Based Row-Independent Compressed Memory for Sparse Compressed Convolutional Neural Networks

Po-Tsang Huang, I-Chen Wu, Chin-Yang Lo, Wei Hwang

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Deep convolutional neural networks (CNNs) are difficult to deploy fully on edge devices because their workloads are both memory-intensive and computation-intensive. The energy efficiency of CNNs is dominated by convolution computation and off-chip memory (DRAM) accesses, especially the latter. In this article, an energy-efficient accelerator is proposed for sparse compressed CNNs that reduces DRAM accesses and eliminates zero-operand computation. Weight compression is applied to sparse compressed CNNs to reduce the required memory capacity and bandwidth and to remove a large portion of connections. A tile-based row-independent compression (TRC) method with relative indexing memory is then adopted for storing the non-zero terms. Additionally, the workloads are distributed by channel to increase the degree of task parallelism, and all-row-to-all-row non-zero element multiplication is adopted to skip redundant computation. Compared with a dense accelerator, simulation results show that the proposed accelerator achieves a 1.79× speedup and reduces on-chip memory size, energy, and DRAM accesses by 23.51%, 69.53%, and 88.67%, respectively, for VGG-16.
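
The abstract does not give the exact TRC storage format, so the following is only a minimal sketch of the general idea under stated assumptions: each row of a weight tile is compressed independently into (gap, value) pairs, where the gap is the number of zeros since the previous non-zero in that row (a relative index), and a dot product then visits only non-zero weights and skips zero activations. The function names, the tuple layout, and the unbounded gap width are all hypothetical choices for illustration; a hardware format would typically use fixed-width index fields.

```python
from typing import List, Tuple

# One compressed row: (zeros skipped since the previous non-zero, value).
# This layout is an assumption for illustration, not the paper's format.
Row = List[Tuple[int, int]]


def compress_tile(tile: List[List[int]]) -> List[Row]:
    """Compress every row of a tile independently of the other rows."""
    compressed = []
    for row in tile:
        pairs, gap = [], 0
        for value in row:
            if value == 0:
                gap += 1                    # count zeros, store nothing
            else:
                pairs.append((gap, value))  # relative index + non-zero term
                gap = 0
        compressed.append(pairs)
    return compressed


def decompress_row(pairs: Row, width: int) -> List[int]:
    """Rebuild a dense row of the given width from its (gap, value) pairs."""
    row = [0] * width
    col = -1
    for gap, value in pairs:
        col += gap + 1                      # absolute column from relative index
        row[col] = value
    return row


def sparse_dot(weight_pairs: Row, activations: List[int]) -> Tuple[int, int]:
    """Dot product over non-zero weights only, skipping zero activations.

    Returns (result, number of multiplications actually performed).
    """
    acc, multiplies, col = 0, 0, -1
    for gap, weight in weight_pairs:
        col += gap + 1
        activation = activations[col]
        if activation != 0:                 # zero-operand computation is skipped
            acc += weight * activation
            multiplies += 1
    return acc, multiplies


tile = [[0, 3, 0, 0, 5],
        [0, 0, 0, 0, 0],
        [7, 0, 0, 2, 0]]
packed = compress_tile(tile)
result, multiplies = sparse_dot(packed[0], [1, 2, 0, 4, 0])
```

Because each row carries its own relative indices, any row can be decoded or multiplied without reading its neighbours, which is the property the term "row-independent" suggests; an all-zero row compresses to an empty list and costs no multiplications in this sketch.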
Original language: American English
Pages (from-to): 131-143
Journal: IEEE Open Journal of Circuits and Systems
Volume: 2
DOIs
State: Published - Jan 2021

Keywords

  • Sparse
  • CNN
  • relative
  • indexing
  • memory