TY - GEN
T1 - Model Compression via Structural Pruning and Feature Distillation for Accurate Multi-Spectral Object Detection on Edge-Devices
AU - Poliakov, Egor
AU - Luu, Van Tin
AU - Tran, Vu Hoang
AU - Huang, Ching Chun
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Multi-spectral infrared object detection across different infrared wavelengths is a challenging task. Although some full-sized object detection models, such as YOLOv4 and ScaledYOLO, can achieve good infrared object detection, they are resource-demanding and unsuitable for real-time detection on edge devices. Tiny versions of such models have been proposed to meet practical requirements, but they usually sacrifice model accuracy and generalization for efficiency. We propose an accurate and efficient object detector capable of performing real-time inference under the hardware constraints of an edge device by leveraging structural pruning, feature distillation, and neural architecture search (NAS). Experiments on FLIR and multi-spectral object detection datasets show that our model achieves mAP comparable to full-sized models while having 14x fewer parameters and 3.5x fewer FLOPs. Our model performs infrared detection well across different infrared wavelengths. The CSPNet configurations selected by NAS for our detection network outperform the baseline.
AB - Multi-spectral infrared object detection across different infrared wavelengths is a challenging task. Although some full-sized object detection models, such as YOLOv4 and ScaledYOLO, can achieve good infrared object detection, they are resource-demanding and unsuitable for real-time detection on edge devices. Tiny versions of such models have been proposed to meet practical requirements, but they usually sacrifice model accuracy and generalization for efficiency. We propose an accurate and efficient object detector capable of performing real-time inference under the hardware constraints of an edge device by leveraging structural pruning, feature distillation, and neural architecture search (NAS). Experiments on FLIR and multi-spectral object detection datasets show that our model achieves mAP comparable to full-sized models while having 14x fewer parameters and 3.5x fewer FLOPs. Our model performs infrared detection well across different infrared wavelengths. The CSPNet configurations selected by NAS for our detection network outperform the baseline.
KW - Cross Stage Partial Network (CSPNet)
KW - infrared image
KW - model compression
KW - neural architecture search
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=85137689544&partnerID=8YFLogxK
U2 - 10.1109/ICME52920.2022.9859994
DO - 10.1109/ICME52920.2022.9859994
M3 - Conference contribution
AN - SCOPUS:85137689544
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings
PB - IEEE Computer Society
T2 - 2022 IEEE International Conference on Multimedia and Expo, ICME 2022
Y2 - 18 July 2022 through 22 July 2022
ER -