Dataflow and microarchitecture co-optimisation for sparse CNN on distributed processing element accelerator

Duc-An Pham, Bo-Cheng Lai*

*Corresponding author for this work

Research output: Article › peer-review

1 Citation (Scopus)

Abstract

Accelerators that utilise the sparsity of both the activation data and the network structure of convolutional neural networks (CNNs) have demonstrated efficient CNN processing with superior performance. Previous research has identified three critical concerns when designing accelerators for sparse CNNs: data reuse, parallel computing performance, and effective sparse computation. Each of these factors has been addressed in earlier accelerator designs, but no design has considered all of them at the same time. This study provides analytical approaches and experimental results that reveal insights into accelerator design for sparse CNNs. The authors show that all of the architectural aspects, including their mutual effects, need to be considered together to avoid performance pitfalls. Based on the proposed analytical approach, they introduce enhancement techniques and co-design among the factors discussed in this study. The improved architecture achieves up to 1.5x data reuse and/or 1.55x performance improvement in comparison with state-of-the-art sparse CNN accelerators while maintaining comparable area and energy cost.
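To make the "effective sparse computation" concern concrete, the sketch below shows a naive zero-skipping 2-D convolution that exploits both activation sparsity and pruned-weight (network-structure) sparsity. This is an illustrative assumption-laden example only, not the authors' dataflow or microarchitecture; the function name and the sparsity levels are hypothetical.

```python
# Minimal sketch: zero-skipping convolution that avoids multiply-accumulates
# whenever either the weight or the activation is zero. Illustration only;
# not the accelerator design described in the paper.
import numpy as np

def sparse_conv2d(activations, weights):
    """'Valid'-padding 2-D convolution that skips zero operands."""
    H, W = activations.shape
    K, _ = weights.shape
    out = np.zeros((H - K + 1, W - K + 1))
    # Non-zero weight positions: the network-structure (pruning) sparsity.
    nz_weights = [(r, c, weights[r, c])
                  for r in range(K) for c in range(K)
                  if weights[r, c] != 0.0]
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            acc = 0.0
            for r, c, w in nz_weights:
                a = activations[i + r, j + c]
                if a != 0.0:  # skip zero activations (data sparsity)
                    acc += a * w
            out[i, j] = acc
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical sparsity levels: ~60% zero activations, ~50% pruned weights.
    act = rng.random((8, 8)) * (rng.random((8, 8)) > 0.6)
    wgt = rng.random((3, 3)) * (rng.random((3, 3)) > 0.5)
    dense = np.array([[np.sum(act[i:i + 3, j:j + 3] * wgt)
                       for j in range(6)] for i in range(6)])
    print(np.allclose(sparse_conv2d(act, wgt), dense))  # True
```

A real accelerator realises this zero-skipping in hardware across distributed processing elements, which is why the paper argues that sparse computation must be co-optimised with data reuse and parallelism rather than in isolation.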

Original language: English
Pages (from-to): 1185-1194
Number of pages: 10
Journal: IET Circuits, Devices and Systems
Volume: 14
Issue number: 8
DOIs
Publication status: Published - Nov. 2020
