TY - GEN
T1 - Toward Fast Platform-Aware Neural Architecture Search for FPGA-Accelerated Edge AI Applications
AU - Liang, Yi Chuan
AU - Liao, Ying Chiao
AU - Lin, Chen Ching
AU - Hung, Shih Hao
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/10/13
Y1 - 2020/10/13
N2 - Neural Architecture Search (NAS) is a technique for finding suitable neural network architecture models for given applications. Previously, such search methods were usually based on reinforcement learning, with a recurrent neural network generating candidate neural network models. However, most NAS methods aim to find a set of candidates with the best cost-performance ratios, e.g., high accuracy and low computing time, based on rough estimates derived generically from the workload. As today's deep learning chips accelerate neural network operations with a variety of hardware tricks, such as vector units and low-precision data formats, metrics estimated from generic computing operations, such as floating-point operations (FLOPs), can differ greatly from the actual latency, throughput, power consumption, etc., which are highly sensitive to the hardware design and even the software optimization in edge AI applications. Thus, instead of taking a long time to repeatedly pick and train so-called good candidates based on unreliable estimates, we propose a NAS framework that accelerates the search process by including actual performance measurements in the search process. The inclusion of actual measurements enables the proposed NAS framework to find candidates based on correct information and reduces the possibility of selecting wrong candidates and wasting search time on them. To illustrate the effectiveness of our framework, we prototyped it to work with Intel OpenVINO and Field-Programmable Gate Arrays (FPGAs) to meet the accuracy and latency required by the user. The framework takes the dataset, accuracy, and latency requirements from the user and automatically searches for candidates that meet the requirements. Case studies and experimental results are presented in this paper to evaluate the effectiveness of our framework for edge AI applications in real-time image classification.
AB - Neural Architecture Search (NAS) is a technique for finding suitable neural network architecture models for given applications. Previously, such search methods were usually based on reinforcement learning, with a recurrent neural network generating candidate neural network models. However, most NAS methods aim to find a set of candidates with the best cost-performance ratios, e.g., high accuracy and low computing time, based on rough estimates derived generically from the workload. As today's deep learning chips accelerate neural network operations with a variety of hardware tricks, such as vector units and low-precision data formats, metrics estimated from generic computing operations, such as floating-point operations (FLOPs), can differ greatly from the actual latency, throughput, power consumption, etc., which are highly sensitive to the hardware design and even the software optimization in edge AI applications. Thus, instead of taking a long time to repeatedly pick and train so-called good candidates based on unreliable estimates, we propose a NAS framework that accelerates the search process by including actual performance measurements in the search process. The inclusion of actual measurements enables the proposed NAS framework to find candidates based on correct information and reduces the possibility of selecting wrong candidates and wasting search time on them. To illustrate the effectiveness of our framework, we prototyped it to work with Intel OpenVINO and Field-Programmable Gate Arrays (FPGAs) to meet the accuracy and latency required by the user. The framework takes the dataset, accuracy, and latency requirements from the user and automatically searches for candidates that meet the requirements. Case studies and experimental results are presented in this paper to evaluate the effectiveness of our framework for edge AI applications in real-time image classification.
KW - AI
KW - Deep Learning
KW - Edge Computing
KW - FPGA
KW - OpenVINO
KW - GPU
KW - Neural Architecture Search
KW - Performance Evaluation
KW - Reinforcement Learning
UR - http://www.scopus.com/inward/record.url?scp=85097365651&partnerID=8YFLogxK
U2 - 10.1145/3400286.3418240
DO - 10.1145/3400286.3418240
M3 - Conference contribution
AN - SCOPUS:85097365651
T3 - ACM International Conference Proceeding Series
SP - 219
EP - 225
BT - Proceedings of the 2020 Research in Adaptive and Convergent Systems, RACS 2020
PB - Association for Computing Machinery
T2 - 2020 Research in Adaptive and Convergent Systems, RACS 2020
Y2 - 13 October 2020 through 16 October 2020
ER -