Multi-fusion feature pyramid for real-time hand detection

Chuan Wang Chang, Santanu Santra, Jun Wei Hsieh*, Pirdiansyah Hendri, Chi Fang Lin


Research output: Article › Peer-reviewed


Real-time HI (Human Interface) systems need accurate and efficient hand detection models that operate within tight limits on cost, device size, memory, computing power, and energy consumption. Hand detection also matters for other applications such as homecare systems, fine-grained action recognition, movie interpretation, and even the understanding of dance gestures. In recent years, object detection has become a less challenging task thanks to deep CNN-based state-of-the-art models such as RCNN, SSD, and YOLO. However, these models cannot achieve the desired efficiency and accuracy on HI-based embedded devices because of their complex, time-consuming architectures. Another critical issue is that small hands (<30 × 30 pixels) remain challenging for all of the above methods. To address these problems, we propose a shallow model named Multi-fusion Feature Pyramid for real-time hand detection. Experimental results on the Oxford hand dataset combined with the skin dataset show that the proposed method outperforms other SoTA methods in accuracy, efficiency, and real-time speed. On the COCO dataset, the proposed CFPN model likewise achieves the highest efficiency and accuracy among the compared state-of-the-art methods. We therefore conclude that the proposed model is useful for real-life small-hand detection on embedded devices.
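The abstract does not detail the internals of the proposed CFPN; as an illustration only, the general top-down feature-pyramid fusion that such models build on can be sketched as below. The function names `upsample2x` and `fuse_pyramid`, the channel count, and the level resolutions are all hypothetical choices for this sketch, not the paper's actual architecture.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbor 2x spatial upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_pyramid(features):
    """Top-down fusion sketch: start at the coarsest level, upsample it,
    and add it element-wise to the next finer level, yielding one fused
    map per scale (finer levels gain coarse-level context for small objects)."""
    fused = [features[-1]]  # coarsest level passes through unchanged
    for f in reversed(features[:-1]):
        fused.append(f + upsample2x(fused[-1]))
    return fused[::-1]  # return in finest-to-coarsest order, matching the input

# Three hypothetical pyramid levels: same channels, resolution halving per level.
levels = [np.ones((8, 32, 32)), np.ones((8, 16, 16)), np.ones((8, 8, 8))]
out = fuse_pyramid(levels)
print([o.shape for o in out])  # [(8, 32, 32), (8, 16, 16), (8, 8, 8)]
```

Fusing coarse semantic context into the finest level in this way is the standard feature-pyramid rationale for detecting small objects such as <30 × 30 pixel hands; the paper's "multi-fusion" presumably extends this basic top-down pathway.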

Journal: Multimedia Tools and Applications
Publication status: Accepted/In press - 2022

