Bayesian asymmetric quantized neural networks

Jen-Tzung Chien*, Su-Ting Chang

*Corresponding author of this work

Research output: Article › peer-review

7 Citations (Scopus)

Abstract

This paper develops robust model compression for neural networks via parameter quantization. Traditionally, quantized neural networks (QNNs) have been constructed with deterministic binary or ternary weights. This paper generalizes the QNN in two directions. First, an M-ary QNN is developed to adjust the balance between memory storage and model capacity. The representation values and the quantization partitions in M-ary quantization are mutually estimated to enhance the resolution of gradients in neural network training, and a flexible quantization with asymmetric partitions is formulated. Second, variational inference is incorporated to implement a Bayesian asymmetric QNN. The uncertainty of the weights is faithfully represented to enhance the robustness of the trained model in the presence of heterogeneous data. Importantly, a multiple spike-and-slab prior is proposed to represent the quantization levels in Bayesian asymmetric learning. M-ary quantization is then optimized by maximizing the evidence lower bound of the classification network. An adaptive parameter space is built to implement Bayesian quantization and neural representation. Experiments on various image recognition tasks show that the M-ary QNN achieves performance comparable to that of a full-precision neural network (FPNN), while the memory cost and the test time are significantly reduced relative to the FPNN. The merit of the Bayesian M-ary QNN using the multiple spike-and-slab prior is investigated.
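To make the mutual estimation in M-ary quantization concrete, below is a minimal sketch in the spirit of a Lloyd-Max alternation between representation values and partition thresholds; the function name m_ary_quantize, the quantile initialization, and the stopping rule are illustrative assumptions rather than the authors' implementation.

import numpy as np

def m_ary_quantize(weights, M=4, n_iter=20):
    """Alternately estimate M representation values and the M-1 asymmetric
    partition thresholds for a weight tensor (a sketch, not the paper's code)."""
    w = weights.ravel()
    # Initialize representation values from quantiles, so the partition is
    # asymmetric whenever the weight distribution is skewed.
    levels = np.quantile(w, (np.arange(M) + 0.5) / M)
    for _ in range(n_iter):
        # Partition thresholds sit midway between adjacent representation values.
        thresholds = (levels[:-1] + levels[1:]) / 2.0
        # Assign each weight to a cell, then update each representation value
        # as the mean of the weights assigned to that cell.
        cells = np.digitize(w, thresholds)
        new_levels = np.array([
            w[cells == m].mean() if np.any(cells == m) else levels[m]
            for m in range(M)
        ])
        if np.allclose(new_levels, levels):
            break
        levels = new_levels
    thresholds = (levels[:-1] + levels[1:]) / 2.0
    return levels[np.digitize(w, thresholds)].reshape(weights.shape), levels

# Example: quantize a random weight matrix to M = 4 asymmetric levels.
# w_q, levels = m_ary_quantize(np.random.randn(256, 256), M=4)

The abstract does not give the form of the multiple spike-and-slab prior; one plausible reading, stated here as an assumption, mixes M narrow Gaussian components centered at the representation values q_m, so that each component acts as a spike at one quantization level:

p(w) = \sum_{m=1}^{M} \pi_m \, \mathcal{N}(w \mid q_m, \sigma_m^2), \qquad \sum_{m=1}^{M} \pi_m = 1.

Training then maximizes the evidence lower bound of the classification network, in the standard variational form

\mathcal{L} = \mathbb{E}_{q(\mathbf{w})}\left[\log p(\mathcal{D} \mid \mathbf{w})\right] - \mathrm{KL}\left(q(\mathbf{w}) \,\|\, p(\mathbf{w})\right),

where the KL term pulls the variational posterior over the weights toward the quantized prior.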

Original language: English
Article number: 109463
Journal: Pattern Recognition
Volume: 139
DOIs
Publication status: Published - July 2023
