StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller

Hong Sheng Zheng, Chen Fong Hsu, Yu Yuan Liu, Tsung Tai Yeh

Research output: Conference article › peer-reviewed

2 Citations (Scopus)

Abstract

With the emergence of Tiny Machine Learning (TinyML) inference applications, there is growing interest in deploying TinyML models on low-power Microcontroller Units (MCUs). However, deploying TinyML models on MCUs poses several challenges due to the MCU's resource constraints, such as small flash memory, a tight SRAM memory budget, and slow CPU performance. Unlike typical layer-wise inference, patch-based inference reduces the peak SRAM usage on MCUs by keeping small patches rather than the entire tensor in SRAM. However, patch-based inference substantially increases the number of MACs compared with the layer-wise method, and this computational overhead makes it undesirable on MCUs. This work designs StreamNet, which employs a stream buffer to eliminate the redundant computation of patch-based inference. StreamNet uses 1D and 2D streaming processing and provides a parameter selection algorithm that automatically improves the performance of patch-based inference with minimal requirements on the MCU's SRAM memory space. Across 10 TinyML models, StreamNet-2D achieves a geometric-mean speedup of 7.3X and saves 81% of the MACs over state-of-the-art patch-based inference.
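To make the stream-buffer idea concrete, below is a minimal, self-contained C sketch (not the authors' code): a two-layer 1D convolution chain processed patch by patch, where a small stream buffer caches the layer-1 outputs that adjacent output patches share, so each is computed only once instead of being recomputed per patch. All names (conv_at, l1_buf), tensor sizes, the patch width, and the weights are illustrative assumptions rather than details from the paper.

/* Minimal sketch, assuming a two-layer 1D convolution chain; sizes,
 * names, and weights are illustrative, not from StreamNet itself.
 * Patch-based inference normally recomputes the layer-1 outputs that
 * overlap between adjacent output patches; the stream buffer below
 * caches them so each layer-1 output is computed exactly once. */
#include <stdio.h>

#define IN_LEN   16                 /* input length                      */
#define K        3                  /* kernel size of both layers        */
#define L1_LEN   (IN_LEN - K + 1)   /* layer-1 output length             */
#define L2_LEN   (L1_LEN - K + 1)   /* layer-2 (final) output length     */
#define PATCH    4                  /* output elements produced per patch */

static const float w1[K] = {0.25f, 0.5f, 0.25f};
static const float w2[K] = {0.25f, 0.5f, 0.25f};

/* One output element of a 1D convolution at position pos. */
static float conv_at(const float *x, const float *w, int pos)
{
    float acc = 0.0f;
    for (int k = 0; k < K; k++)
        acc += w[k] * x[pos + k];
    return acc;
}

int main(void)
{
    float in[IN_LEN];
    for (int i = 0; i < IN_LEN; i++)
        in[i] = (float)i;

    /* Stream buffer: caches layer-1 outputs already produced; the
     * K - 1 elements shared by consecutive patches are reused. */
    float l1_buf[L1_LEN];
    int   l1_valid[L1_LEN] = {0};
    int   mac_count = 0;

    float out[L2_LEN];
    for (int start = 0; start < L2_LEN; start += PATCH) {
        int end = (start + PATCH < L2_LEN) ? start + PATCH : L2_LEN;

        /* Layer-1 outputs needed by this output patch, computed once. */
        for (int p = start; p < end + K - 1 && p < L1_LEN; p++) {
            if (!l1_valid[p]) {
                l1_buf[p] = conv_at(in, w1, p);
                l1_valid[p] = 1;
                mac_count += K;
            }
        }
        /* Layer-2 outputs for this patch, read from the stream buffer. */
        for (int p = start; p < end; p++) {
            out[p] = conv_at(l1_buf, w2, p);
            mac_count += K;
        }
    }

    printf("final output[0..%d], total MACs = %d\n", L2_LEN - 1, mac_count);
    for (int i = 0; i < L2_LEN; i++)
        printf("%.2f ", out[i]);
    printf("\n");
    return 0;
}

In this toy setting the buffer keeps the layer-1 MAC count at L1_LEN * K; without it, each patch would redo the overlapping layer-1 work, which is the redundancy StreamNet's 1D/2D streaming is designed to remove.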

Original language: English
Journal: Advances in Neural Information Processing Systems
Volume: 36
Publication status: Published - 2023
Event: 37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States
Duration: 10 Dec 2023 - 16 Dec 2023
