A 2.25 TOPS/W fully-integrated deep CNN learning processor with on-chip training

Cheng Hsun Lu, Yi Chung Wu, Chia Hsiang Yang

Research output: Conference contribution, peer-reviewed

19 citations (Scopus)

Abstract

This paper presents a deep learning processor that supports both inference and training for an entire convolutional neural network (CNN) of any size. The proposed design enables on-chip training for applications that require high security and privacy. Techniques across design abstraction levels are applied to improve the energy efficiency. Rearrangement of the weights in filters is leveraged to reduce the processing latency by 88%. Integration of fixed-point and floating-point arithmetic reduces the area of the multiplier by 56.8%, resulting in a unified processing element (PE) with 33% less area. In the low-precision mode, clock gating and data gating are employed to reduce the power of the PE cluster by 62%. Maxpooling and ReLU modules are co-designed to reduce the memory usage by 75%. A modified softmax function is utilized to reduce the area by 78%. Fabricated in 40nm CMOS, the chip consumes 18.7 mW and 64.5 mW for inference and training, respectively, at 82 MHz from a 0.6V supply. It achieves an energy efficiency of 2.25 TOPS/W, which is 2.67 times higher than the state-of-the-art learning processors. The chip also achieves a 2×10^5 times higher energy efficiency in training than a high-end CPU.
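The headline figures in the abstract can be related to one another with a few lines of arithmetic. The Python sketch below is illustrative only and is not taken from the paper. It assumes that the 2.25 TOPS/W figure applies at the quoted inference operating point (18.7 mW, 82 MHz), which the abstract does not state explicitly. The function names are hypothetical.

```python
# Figures quoted in the abstract.
EFFICIENCY_TOPS_PER_W = 2.25   # reported energy efficiency
INFERENCE_POWER_MW = 18.7      # inference power at 82 MHz, 0.6 V
TRAINING_POWER_MW = 64.5       # training power at 82 MHz, 0.6 V
CLOCK_MHZ = 82.0               # clock frequency
GAIN_OVER_PRIOR = 2.67         # reported ratio vs. prior learning processors


def implied_throughput_gops(efficiency_tops_per_w: float, power_mw: float) -> float:
    """Throughput in GOPS implied by an efficiency and a power figure.

    TOPS/W * mW = (1e12 ops/s/W) * (1e-3 W) = 1e9 ops/s = GOPS.
    ASSUMPTION: both figures describe the same operating point.
    """
    return efficiency_tops_per_w * power_mw


def ops_per_cycle(throughput_gops: float, clock_mhz: float) -> float:
    """Average operations per clock cycle (GOPS / MHz = 1e3 ops/cycle)."""
    return throughput_gops * 1e3 / clock_mhz


def energy_per_cycle_nj(power_mw: float, clock_mhz: float) -> float:
    """Energy drawn per clock cycle in nJ (mW / MHz = nJ)."""
    return power_mw / clock_mhz


def implied_prior_efficiency(efficiency_tops_per_w: float, gain: float) -> float:
    """Efficiency of the comparison point implied by the reported ratio."""
    return efficiency_tops_per_w / gain


gops = implied_throughput_gops(EFFICIENCY_TOPS_PER_W, INFERENCE_POWER_MW)
print(f"implied throughput:        {gops:.1f} GOPS")
print(f"implied ops per cycle:     {ops_per_cycle(gops, CLOCK_MHZ):.0f}")
print(f"inference energy / cycle:  {energy_per_cycle_nj(INFERENCE_POWER_MW, CLOCK_MHZ):.3f} nJ")
print(f"training energy / cycle:   {energy_per_cycle_nj(TRAINING_POWER_MW, CLOCK_MHZ):.3f} nJ")
print(f"implied prior efficiency:  "
      f"{implied_prior_efficiency(EFFICIENCY_TOPS_PER_W, GAIN_OVER_PRIOR):.2f} TOPS/W")
```

Under that assumption the numbers work out to about 42 GOPS, or roughly 513 operations per clock cycle. If the efficiency was measured at a different operating point, these derived values do not apply.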

Original language: English
Title of host publication: Proceedings - 2019 IEEE Asian Solid-State Circuits Conference, A-SSCC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 65-68
Number of pages: 4
ISBN (Electronic): 9781728151069
Publication status: Published - Nov 2019
Event: 15th IEEE Asian Solid-State Circuits Conference, A-SSCC 2019 - Macao, China
Duration: 4 Nov 2019 to 6 Nov 2019

Publication series

Name: Proceedings - 2019 IEEE Asian Solid-State Circuits Conference, A-SSCC 2019
2019-November

Conference

Conference: 15th IEEE Asian Solid-State Circuits Conference, A-SSCC 2019
Country/Territory: China
City: Macao
Period: 4/11/19 to 6/11/19
