TY - JOUR
T1 - IVS-Caffe - Hardware-Oriented Neural Network Model Development
AU - Tsai, Chia-Chi
AU - Guo, Jiun-In
N1 - Publisher Copyright:
IEEE
PY - 2021/7
Y1 - 2021/7
N2 - This article proposes a hardware-oriented neural network development tool, called Intelligent Vision System Lab (IVS)-Caffe. IVS-Caffe can simulate the hardware behavior of convolutional neural network (CNN) inference computation. It can quantize the weights, input features, and output features of a CNN and simulate the behavior of multiplier and accumulator computations to achieve bit-accurate results. Furthermore, it can test the accuracy of the chosen CNN hardware accelerator. In addition, this article proposes an algorithm to correct the deviation of gradient backpropagation introduced by the bit-accurate quantized multipliers and accumulators. This allows the training of a bit-accurate model and further increases the accuracy of the CNN model at a user-designed bit width. The proposed tool takes Faster region-based CNN (R-CNN) + ZF-Net (Zeiler and Fergus), Single Shot MultiBox Detector (SSD) + VGG, SSD + MobileNet, and Tiny You Only Look Once (YOLO) v2 as the experimental models. These models cover both one-stage and two-stage object detection, and their base networks include convolutional layers, fully connected layers, and modern advanced layers such as the inception module and depthwise separable convolution. In these experiments, directly quantizing layer-I/O fixed-point models to bit-accurate models incurs a 2% mean average precision (mAP) accuracy drop under the constraint that all layers' accumulators and multipliers are quantized to at most 14 and 12 bits, respectively. After retraining these quantized models with the proposed IVS-Caffe, we achieve less than a 1% mAP drop under the constraint that all layers' accumulators and multipliers are quantized to at most 14 and 11 bits, respectively. With the proposed IVS-Caffe, we can analyze the accuracy of the target model when it runs on hardware accelerators with different bit widths, which helps fine-tune the target model or customize hardware accelerators for lower power consumption. Code is available at https://github.com/apple35932003/IVS-Caffe.
AB - This article proposes a hardware-oriented neural network development tool, called Intelligent Vision System Lab (IVS)-Caffe. IVS-Caffe can simulate the hardware behavior of convolutional neural network (CNN) inference computation. It can quantize the weights, input features, and output features of a CNN and simulate the behavior of multiplier and accumulator computations to achieve bit-accurate results. Furthermore, it can test the accuracy of the chosen CNN hardware accelerator. In addition, this article proposes an algorithm to correct the deviation of gradient backpropagation introduced by the bit-accurate quantized multipliers and accumulators. This allows the training of a bit-accurate model and further increases the accuracy of the CNN model at a user-designed bit width. The proposed tool takes Faster region-based CNN (R-CNN) + ZF-Net (Zeiler and Fergus), Single Shot MultiBox Detector (SSD) + VGG, SSD + MobileNet, and Tiny You Only Look Once (YOLO) v2 as the experimental models. These models cover both one-stage and two-stage object detection, and their base networks include convolutional layers, fully connected layers, and modern advanced layers such as the inception module and depthwise separable convolution. In these experiments, directly quantizing layer-I/O fixed-point models to bit-accurate models incurs a 2% mean average precision (mAP) accuracy drop under the constraint that all layers' accumulators and multipliers are quantized to at most 14 and 12 bits, respectively. After retraining these quantized models with the proposed IVS-Caffe, we achieve less than a 1% mAP drop under the constraint that all layers' accumulators and multipliers are quantized to at most 14 and 11 bits, respectively. With the proposed IVS-Caffe, we can analyze the accuracy of the target model when it runs on hardware accelerators with different bit widths, which helps fine-tune the target model or customize hardware accelerators for lower power consumption. Code is available at https://github.com/apple35932003/IVS-Caffe.
KW - Bit-accurate model quantization
KW - convolutional neural network
KW - deep learning model quantization
KW - low-power inference in edge device
UR - http://www.scopus.com/inward/record.url?scp=85111605420&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2021.3072145
DO - 10.1109/TNNLS.2021.3072145
M3 - Article
C2 - 34310321
AN - SCOPUS:85111605420
SN - 2162-237X
VL - 33
SP - 5978
EP - 5992
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 10
ER -