TY - GEN
T1 - A Skeleton-based View-Invariant Framework for Human Fall Detection in an Elevator
AU - Ali, Rashid
AU - Hutomo, Ivan Surya
AU - Van, Lan-Da
AU - Tseng, Yu-Chee
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - This paper considers the emergency behavior detection problem inside an elevator. As elevators come in different shapes and emergency behavior data are scarce, we propose a skeleton-based view-invariant framework to tackle the camera view-angle variation issue and the data collection issue. The proposed emergency fall detection model only needs to be trained for a target camera, which is deployed in an elevator at a manufacturer's lab, from which a large amount of training data can be collected. A source camera, deployed in a customer-side elevator, can hence be customized with almost no training effort. Our framework works in four stages. First, a 2D RGB input image is taken from the source camera and a 2D human skeleton is obtained by 2D pose estimation (AlphaPose). Second, the 2D skeleton is converted to a 3D human skeleton by 3D pose estimation (3D pose baseline). Third, a pre-trained rotation-translation (RT) transform, obtained via Procrustes analysis (PA), aligns the 3D pose representation to the target camera view. Finally, a dual 3D pose baseline deep neural network (D3PBDNN) model is proposed to perform the fall recognition task. We collect a human fall detection dataset inside different elevators from various view angles and validate our proposal. Experimental results attain accuracy nearly equivalent to that of a model trained directly on the source camera.
AB - This paper considers the emergency behavior detection problem inside an elevator. As elevators come in different shapes and emergency behavior data are scarce, we propose a skeleton-based view-invariant framework to tackle the camera view-angle variation issue and the data collection issue. The proposed emergency fall detection model only needs to be trained for a target camera, which is deployed in an elevator at a manufacturer's lab, from which a large amount of training data can be collected. A source camera, deployed in a customer-side elevator, can hence be customized with almost no training effort. Our framework works in four stages. First, a 2D RGB input image is taken from the source camera and a 2D human skeleton is obtained by 2D pose estimation (AlphaPose). Second, the 2D skeleton is converted to a 3D human skeleton by 3D pose estimation (3D pose baseline). Third, a pre-trained rotation-translation (RT) transform, obtained via Procrustes analysis (PA), aligns the 3D pose representation to the target camera view. Finally, a dual 3D pose baseline deep neural network (D3PBDNN) model is proposed to perform the fall recognition task. We collect a human fall detection dataset inside different elevators from various view angles and validate our proposal. Experimental results attain accuracy nearly equivalent to that of a model trained directly on the source camera.
KW - 2D/3D pose estimation
KW - deep neural network
KW - fall detection
KW - Procrustes analysis
KW - skeleton
KW - view-invariant
UR - http://www.scopus.com/inward/record.url?scp=85146348830&partnerID=8YFLogxK
U2 - 10.1109/ICIT48603.2022.10002823
DO - 10.1109/ICIT48603.2022.10002823
M3 - Conference contribution
AN - SCOPUS:85146348830
T3 - Proceedings of the IEEE International Conference on Industrial Technology
BT - 2022 IEEE International Conference on Industrial Technology, ICIT 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Industrial Technology, ICIT 2022
Y2 - 22 August 2022 through 25 August 2022
ER -