TY - GEN
T1 - An RRAM-Based 40.6 TOPS/W Energy-Efficient AI Inference Accelerator with Quad Neuromorphic-Processor-Unit for Highly Contrast Recognition
AU - Lin, Y. L.
AU - Liu, Y. R.
AU - Kao, T. C.
AU - Lee, M. Y.
AU - Guo, J. C.
AU - Hou, T. H.
AU - Chung, Steve S.
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - We present a non-volatile deep neural network accelerator for edge AI inference that combines a resistive-gate FinFET (RG-FinFET) memory with a parallel processor. The RG-FinFET has the potential for 8-level operation. In the system, data storage and multiplication are carried out in the RG-FinFET array, and all other operations are performed in a 4-core neuromorphic processing unit (NPU). Quantization error is introduced into the training stage through an ex-situ quantized training method; thus, the accuracy can still reach 97.24% and 80.18% for the MNIST and CIFAR-10 datasets, respectively, while the parameter capacity is nearly 8x smaller. The system achieves a computation efficiency of 40.6 TOPS/W, which makes it well suited to end-to-end integer-only AI inference hardware in CIM.
AB - We present a non-volatile deep neural network accelerator for edge AI inference that combines a resistive-gate FinFET (RG-FinFET) memory with a parallel processor. The RG-FinFET has the potential for 8-level operation. In the system, data storage and multiplication are carried out in the RG-FinFET array, and all other operations are performed in a 4-core neuromorphic processing unit (NPU). Quantization error is introduced into the training stage through an ex-situ quantized training method; thus, the accuracy can still reach 97.24% and 80.18% for the MNIST and CIFAR-10 datasets, respectively, while the parameter capacity is nearly 8x smaller. The system achieves a computation efficiency of 40.6 TOPS/W, which makes it well suited to end-to-end integer-only AI inference hardware in CIM.
UR - http://www.scopus.com/inward/record.url?scp=85196721861&partnerID=8YFLogxK
U2 - 10.1109/VLSITSA60681.2024.10546404
DO - 10.1109/VLSITSA60681.2024.10546404
M3 - Conference contribution
AN - SCOPUS:85196721861
T3 - 2024 International VLSI Symposium on Technology, Systems and Applications, VLSI TSA 2024 - Proceedings
BT - 2024 International VLSI Symposium on Technology, Systems and Applications, VLSI TSA 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International VLSI Symposium on Technology, Systems and Applications, VLSI TSA 2024
Y2 - 22 April 2024 through 25 April 2024
ER -