Software and hardware enhancement of convolutional neural networks on GPGPUs

An Ting Cheng*, Chun Yen Chen, Bo-Cheng Lai, Che Huai Lin


Research output: Article (peer-reviewed)

1 citation (Scopus)


Convolutional Neural Networks (CNNs) have gained attention in recent years for their ability to perform complex machine learning tasks with high accuracy and resilience to input noise. The time-consuming convolution operations of CNNs pose great challenges to both software and hardware designers. To achieve superior performance, a design must carefully balance exposing massive computation parallelism against exploiting data reuse in complex data accesses. Existing designs lack a comprehensive analysis of design techniques and decisions. The analytical discussion and quantitative evidence behind design criteria, such as choosing the proper dimensions to parallelize, have not been well studied. This paper performs a series of qualitative and quantitative studies on both the programming techniques and their implications for the GPU architecture. The observations provide a comprehensive understanding of the correlation between the design techniques and the resulting performance. Based on these analyses, we pinpoint the two major performance bottlenecks of CNNs on GPGPUs: performing computation and loading data from global memory. Software and hardware enhancements are proposed in this paper to alleviate these issues. Experimental results on a cycle-accurate GPGPU simulator demonstrate up to a 4.4x performance improvement over the reference design.

Pages (from-to): 28-39
Journal: Advances in Science, Technology and Engineering Systems
Publication status: Published - 2018

