Memory Bandwidth Efficient Design for Super-Resolution Accelerators With Structure Adaptive Fusion and Channel-Aware Addressing

An-Jung Huang, Jo-Hsuan Hung, Tian-Sheuan Chang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

State-of-the-art (SOTA) super-resolution (SR) models can generate high-quality images. However, they require a large external memory bandwidth, making them impractical to implement in hardware. Although prior work has presented various kinds of layer fusion to reduce memory traffic, these schemes only work on simple model architectures and consider only the feature extraction part. To solve these issues, this article proposes structure adaptive fusion (SAF) for the feature extraction part to avoid intermediate feature-map I/O. This method selects the repetitive structure as the fusion unit and fuses multiple units to meet buffer size and memory bandwidth constraints, allowing it to handle different SR models. In addition, we propose channel-aware addressing for the upscale part to avoid off-chip data transfers. The proposed methods achieve over 90% memory traffic reduction on all tested SOTA models. Compared to the SOTA fusion method, our approach requires a 52% smaller buffer size and up to 61% lower memory bandwidth for the same number of fused layers.
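The core idea described above can be illustrated with a minimal sketch: treat each repetitive block of the SR backbone as a fusion unit, then greedily merge consecutive units so long as the on-chip buffer needed to keep their intermediate feature maps stays within budget. The function name, the per-unit cost model, and the greedy grouping policy below are all illustrative assumptions, not the paper's actual algorithm.

```python
def fuse_units(unit_buffer_kb, buffer_budget_kb):
    """Group consecutive fusion units into fused groups.

    unit_buffer_kb: on-chip buffer (KB) assumed necessary to keep each
        unit's intermediate feature maps on chip (hypothetical cost model).
    buffer_budget_kb: available on-chip buffer (KB).

    Returns a list of groups, each a list of unit indices. Units within a
    group exchange data on chip, so only group boundaries incur off-chip
    feature-map traffic.
    """
    groups, current, used = [], [], 0
    for i, need in enumerate(unit_buffer_kb):
        # Start a new group when adding this unit would exceed the budget.
        if current and used + need > buffer_budget_kb:
            groups.append(current)
            current, used = [], 0
        current.append(i)
        used += need
    if current:
        groups.append(current)
    return groups

# Example: six identical residual blocks, 40 KB of intermediates each,
# with a 128 KB buffer budget -> three blocks per fused group.
print(fuse_units([40] * 6, 128))  # [[0, 1, 2], [3, 4, 5]]
```

With fused groups, off-chip traffic is paid only at group boundaries instead of after every layer, which is why fusing more units under a fixed buffer directly reduces memory bandwidth.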

Original language: English
Pages (from-to): 802-811
Number of pages: 10
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume: 31
Issue number: 6
DOIs
State: Published - 1 Jun 2023

Keywords

  • Convolutional neural networks (CNNs)
  • deep-learning accelerators (DLAs)
  • layer fusion
  • real-time
  • super-resolution (SR)
