Abstract
Hybrid deep neural networks (hybrid-DNNs), composed of convolution, fully connected, and recurrent layers, are used in various embedded applications. However, hybrid-DNNs are difficult to deploy fully on edge devices because their workloads are both memory-intensive and computation-intensive. Additionally, the resource utilization of a hybrid-DNN accelerator is lower than that of CNN accelerators because of the variety of computation kernels and dataflows. Fortunately, 3D-SRAM cubes built with through-silicon via (TSV) 3D-stacking technologies offer a promising solution and are becoming feasible for on-device DNN accelerators. In this paper, a flexible interconnect architecture, the 3D cross-ring, and an efficient dataflow are proposed with 3D-SRAM cubes for an energy-efficient hybrid-DNN accelerator. The micro-routers of the 3D cross-rings are designed to reduce the power of on-chip data movement by about 5× compared to conventional 3D routers. Moreover, the efficient dataflow with dynamic workload distribution is designed based on the 3D cross-rings to support different NN layers and models. The proposed accelerator achieves higher than 90% PE utilization and reduces DRAM accesses by about 6× on different DNN models. It improves overall energy efficiency by up to 17.4× on VGG-16 compared to other state-of-the-art CNN accelerators.
Original language | English |
---|---|
Pages (from-to) | 776-778 |
Number of pages | 3 |
Journal | IEEE Journal on Emerging and Selected Topics in Circuits and Systems |
Volume | 11 |
Issue number | 4 |
DOIs | |
State | Published - Dec 2021 |
Keywords
- 3D-SRAM
- 3D-memory
- Hybrid-DNN
- dynamic workload distribution
- interconnection architecture