TY - GEN
T1 - DeepPear
T2 - 25th International Conference on Pattern Recognition, ICPR 2020
AU - Jhuang, You-Ying
AU - Tsai, Wen-Jiin
N1 - Publisher Copyright:
© 2020 IEEE
PY - 2020
Y1 - 2020
N2 - Human action recognition has become a popular topic because it can be applied in many applications such as intelligent surveillance systems, human-robot interaction, and autonomous vehicle control. Human action recognition from RGB video is a challenging task because the learning of actions is easily affected by cluttered backgrounds. To cope with this problem, the proposed method first estimates 3D human poses, which helps remove the cluttered background and focus on the human body. In addition to the human poses, the proposed method also utilizes appearance features near the predicted joints to make the action prediction context-aware. Instead of using 3D convolutional neural networks as many action recognition approaches do, the proposed method uses a two-stream architecture that aggregates the results from skeleton-based and appearance-based approaches to perform action recognition. Experimental results show that the proposed method achieved state-of-the-art performance on NTU RGB+D, a large-scale dataset for human action recognition.
AB - Human action recognition has become a popular topic because it can be applied in many applications such as intelligent surveillance systems, human-robot interaction, and autonomous vehicle control. Human action recognition from RGB video is a challenging task because the learning of actions is easily affected by cluttered backgrounds. To cope with this problem, the proposed method first estimates 3D human poses, which helps remove the cluttered background and focus on the human body. In addition to the human poses, the proposed method also utilizes appearance features near the predicted joints to make the action prediction context-aware. Instead of using 3D convolutional neural networks as many action recognition approaches do, the proposed method uses a two-stream architecture that aggregates the results from skeleton-based and appearance-based approaches to perform action recognition. Experimental results show that the proposed method achieved state-of-the-art performance on NTU RGB+D, a large-scale dataset for human action recognition.
KW - 3D human pose
KW - Human action recognition
UR - http://www.scopus.com/inward/record.url?scp=85110462577&partnerID=8YFLogxK
U2 - 10.1109/ICPR48806.2021.9413011
DO - 10.1109/ICPR48806.2021.9413011
M3 - Conference contribution
AN - SCOPUS:85110462577
T3 - Proceedings - International Conference on Pattern Recognition
SP - 7119
EP - 7125
BT - Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 10 January 2021 through 15 January 2021
ER -