TY - GEN
T1 - An Evaluation and Architecture Exploration Engine for CNN Accelerators through Extensive Dataflow Analysis
AU - Chou, Shan Hui
AU - Hsiao, Ting Yun
AU - Jou, Jing Yang
AU - Huang, Juinn Dar
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - The systolic array is a popular convolutional neural network (CNN) accelerator architecture because of its high computational efficiency. Nevertheless, the huge design space and the complicated interactions among design parameters make it hard to find the best configuration for various applications. To address this issue, this paper presents NNeed, an evaluation and design space exploration engine for systolic-array CNN accelerators based on extensive dataflow analysis. It uses a highly configurable hardware template to describe accelerator operations in detail. Its rapid evaluation reports power, performance, and area (PPA) results, pipeline stage analysis, and external memory access statistics, among other metrics. NNeed explores a 9-dimensional design space and supports multiple objective functions for design optimization. Experimental results show that NNeed can generate an accelerator configuration with up to 23% and 50% improvement in performance and energy, respectively, compared with a typical handcrafted design.
UR - http://www.scopus.com/inward/record.url?scp=85179837423&partnerID=8YFLogxK
U2 - 10.1109/VLSI-SoC57769.2023.10321934
DO - 10.1109/VLSI-SoC57769.2023.10321934
M3 - Conference contribution
AN - SCOPUS:85179837423
T3 - IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC
BT - 2023 IFIP/IEEE 31st International Conference on Very Large Scale Integration, VLSI-SoC 2023
PB - IEEE Computer Society
T2 - 31st IFIP/IEEE International Conference on Very Large Scale Integration, VLSI-SoC 2023
Y2 - 16 October 2023 through 18 October 2023
ER -