Multi-Scale Dynamic Fixed-Point Quantization and Training for Deep Neural Networks

Po Yuan Chen, Hung Che Lin, Jiun In Guo

Research output: Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

State-of-the-art deep neural networks often require extremely high computational power, which makes deploying them on embedded devices impractical. Model quantization is therefore important for deploying deep neural networks on edge devices. This paper quantizes deep neural networks from high precision to a low-precision (e.g., INT8) dynamic fixed-point format on a layer-by-layer basis. In addition, we extend uniform dynamic fixed-point quantization to multi-scale dynamic fixed-point quantization to reduce quantization loss. The proposed multi-scale scheme divides the quantization range into two regions, and each region is assigned its own quantization levels and quantization parameters to better approximate bell-shaped distributions. The proposed quantization pipeline consists of post-training quantization followed by model fine-tuning, which keeps the accuracy drop of the quantized model within 1% mean average precision (mAP). Furthermore, the proposed quantization and fine-tuning method can be combined with model pruning to obtain a compact and accurate deep neural network with low bit-width.
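
The abstract does not give formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It contrasts a uniform per-layer dynamic fixed-point quantizer with a hypothetical two-region variant. All function names, the threshold value, the sample weights, and the choice to spend one bit as a region selector (leaving 7-bit codes per region) are assumptions made for illustration.

```python
import math

def fractional_length(max_abs, bits=8):
    """Fractional length fl such that max_abs * 2**fl stays below 2**(bits-1)."""
    if max_abs <= 0:
        return bits - 1
    int_bits = math.floor(math.log2(max_abs)) + 1  # bits for the integer part
    return (bits - 1) - int_bits

def quantize(x, fl, bits=8):
    """Round x to a signed `bits`-bit integer code with step 2**-fl."""
    qmin, qmax = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    q = round(x * 2.0 ** fl)  # Python rounds exact halves to the even integer
    return max(qmin, min(qmax, q))

def dequantize(q, fl):
    """Map an integer code back to a real value."""
    return q / 2.0 ** fl

def quantize_two_region(x, threshold, max_abs, bits=8):
    """Hypothetical two-region scheme: fine step for |x| <= threshold, coarse otherwise.

    Returns (region, code, fl), where region 0 is the inner (fine) region.
    Assumption: one bit selects the region, so each region uses bits-1 bits.
    """
    region_bits = bits - 1
    if abs(x) <= threshold:
        fl = fractional_length(threshold, region_bits)
        return 0, quantize(x, fl, region_bits), fl
    fl = fractional_length(max_abs, region_bits)
    return 1, quantize(x, fl, region_bits), fl

def mean_abs_error(values, reconstructed):
    return sum(abs(v - r) for v, r in zip(values, reconstructed)) / len(values)

# Made-up "bell-shaped" layer weights: mostly small values plus a few outliers.
weights = [0.01, -0.02, 0.03, 0.05, -0.07, 0.1, -0.15, 0.2, 1.3, -3.9]
max_abs = max(abs(w) for w in weights)
threshold = 0.25  # hypothetical boundary between the two regions

# Uniform dynamic fixed point: one fractional length for the whole layer.
fl_uniform = fractional_length(max_abs)
uniform = [dequantize(quantize(w, fl_uniform), fl_uniform) for w in weights]

# Two-region variant: each value is reconstructed with its region's step size.
two_region = []
for w in weights:
    _, code, fl = quantize_two_region(w, threshold, max_abs)
    two_region.append(dequantize(code, fl))

err_uniform = mean_abs_error(weights, uniform)
err_two_region = mean_abs_error(weights, two_region)
print("uniform mean abs error:   ", err_uniform)
print("two-region mean abs error:", err_two_region)
```

In this toy setup the inner region gets a finer step than the uniform quantizer while the outer region gets a coarser one, so the mean error falls when most values are small. How the paper actually chooses the region boundary and allocates quantization levels is not stated in the abstract.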

Original language: English
Title of host publication: ISCAS 2023 - 56th IEEE International Symposium on Circuits and Systems, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665451093
Publication status: Published - 2023
Event: 56th IEEE International Symposium on Circuits and Systems, ISCAS 2023 - Monterey, United States
Duration: 21 May 2023 to 25 May 2023

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
2023-May
ISSN (Print): 0271-4310

Conference

Conference: 56th IEEE International Symposium on Circuits and Systems, ISCAS 2023
Country/Territory: United States
City: Monterey
Period: 21/05/23 to 25/05/23
