Multi-fusion feature pyramid for real-time hand detection

Chuan Wang Chang, Santanu Santra, Jun Wei Hsieh*, Pirdiansyah Hendri, Chi Fang Lin

*Corresponding author of this work

Research output: Article, peer-reviewed

Abstract

Real-time HI (Human Interface) systems need accurate and efficient hand detection models that operate within tight limits on cost, size, memory, computation, and electric power. Hand detection is also important for other applications such as homecare systems, fine-grained action recognition, movie interpretation, and even the understanding of dance gestures. In recent years, object detection has become a less challenging task thanks to deep CNN-based state-of-the-art models such as RCNN, SSD, and YOLO. However, these models cannot achieve the desired efficiency and accuracy on HI-based embedded devices because of their complex, time-consuming architectures. Another critical issue in hand detection is that small hands (<30 × 30 pixels) remain challenging for all of the above methods. To address these problems, we propose a shallow model named Multi-fusion Feature Pyramid for real-time hand detection. Experimental results on the Oxford hand dataset combined with the skin dataset show that the proposed method outperforms other SoTA methods in accuracy, efficiency, and real-time speed. On the COCO dataset, the proposed CFPN model also achieves the highest efficiency and accuracy compared with other state-of-the-art methods. We therefore conclude that the proposed model is suitable for real-life small-hand detection on embedded devices.
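The abstract does not specify the layers or fusion rules of the proposed Multi-fusion Feature Pyramid (CFPN), so the snippet below is only a minimal sketch of generic top-down feature-pyramid fusion, the family of techniques the model name suggests. The class name `FeaturePyramidFusion`, the channel widths, and the nearest-neighbor upsampling are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of multi-scale feature fusion in the spirit of a feature
# pyramid network. All names and hyper-parameters here are assumptions for
# illustration; the paper's CFPN design is not described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeaturePyramidFusion(nn.Module):
    def __init__(self, in_channels=(64, 128, 256), out_channels=128):
        super().__init__()
        # 1x1 lateral convolutions project each backbone stage to a common width.
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        # 3x3 convolutions smooth each fused map before a detection head uses it.
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels
        )

    def forward(self, features):
        # `features`: backbone outputs ordered from high to low resolution.
        laterals = [l(f) for l, f in zip(self.lateral, features)]
        fused = [laterals[-1]]  # start from the coarsest level
        # Top-down pathway: upsample the coarser map and add it to the finer one.
        for lat in reversed(laterals[:-1]):
            up = F.interpolate(fused[-1], size=lat.shape[-2:], mode="nearest")
            fused.append(lat + up)
        fused = fused[::-1]  # restore fine-to-coarse order
        return [s(f) for s, f in zip(self.smooth, fused)]

if __name__ == "__main__":
    # Dummy feature maps at strides 8, 16, 32 of a 256x256 input image.
    feats = [torch.randn(1, 64, 32, 32),
             torch.randn(1, 128, 16, 16),
             torch.randn(1, 256, 8, 8)]
    outs = FeaturePyramidFusion()(feats)
    print([o.shape for o in outs])  # three maps, each with 128 channels
```

In such a design, the finest fused map is what helps with small objects (here, hands under 30 × 30 pixels), since it carries both high spatial resolution and semantics propagated down from coarser levels.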

Original language: English
Journal: Multimedia Tools and Applications
Publication status: Accepted/In press - 2022
