TY - GEN
T1 - An Energy-Efficient Ring-Based CIM Accelerator using High-Linearity eNVM for Deep Neural Networks
AU - Huang, Po-Tsang
AU - Liu, Ting Wei
AU - Lu, Wei
AU - Lin, Yu Hsien
AU - Hwang, Wei
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Computation-in-memory (CIM) accelerators reduce the energy consumption of weight accesses from off-chip memory by storing synaptic weights in on-chip embedded NVM (eNVM) devices, such as RRAM, charge-trap transistors and FeFETs. However, when mapping a deep neural network (DNN) with more than 20 layers onto an eNVM-based accelerator, throughput, energy efficiency and accuracy are limited by the non-linearity of the weights and by energy-consuming weight updating. In this work, the proposed CIM accelerator exploits low-voltage, high-linearity eNVM to achieve both power-efficient weight updating and high accuracy. By adopting layer-level weight stationarity, mini-array clusters and a ring-based architecture, the resource utilization of eNVM devices is increased. In addition, channel-wise weight mapping schemes for standard convolution and pointwise convolution support structured pruning of DNNs. The proposed accelerator achieves 1.814 TOPS/W with only 4.7% accuracy loss on YOLOv3 using Ni-crystal RRAM.
AB - Computation-in-memory (CIM) accelerators reduce the energy consumption of weight accesses from off-chip memory by storing synaptic weights in on-chip embedded NVM (eNVM) devices, such as RRAM, charge-trap transistors and FeFETs. However, when mapping a deep neural network (DNN) with more than 20 layers onto an eNVM-based accelerator, throughput, energy efficiency and accuracy are limited by the non-linearity of the weights and by energy-consuming weight updating. In this work, the proposed CIM accelerator exploits low-voltage, high-linearity eNVM to achieve both power-efficient weight updating and high accuracy. By adopting layer-level weight stationarity, mini-array clusters and a ring-based architecture, the resource utilization of eNVM devices is increased. In addition, channel-wise weight mapping schemes for standard convolution and pointwise convolution support structured pruning of DNNs. The proposed accelerator achieves 1.814 TOPS/W with only 4.7% accuracy loss on YOLOv3 using Ni-crystal RRAM.
UR - http://www.scopus.com/inward/record.url?scp=85123353456&partnerID=8YFLogxK
U2 - 10.1109/ISOCC53507.2021.9613978
DO - 10.1109/ISOCC53507.2021.9613978
M3 - Conference contribution
AN - SCOPUS:85123353456
T3 - Proceedings - International SoC Design Conference 2021, ISOCC 2021
SP - 260
EP - 261
BT - Proceedings - International SoC Design Conference 2021, ISOCC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th International System-on-Chip Design Conference, ISOCC 2021
Y2 - 6 October 2021 through 9 October 2021
ER -