TY - GEN
T1 - Human-Object Interaction Detection
T2 - An Overview
AU - Wang, Jia
AU - Shuai, Hong Han
AU - Li, Yung Hui
AU - Cheng, Wen Huang
N1 - Publisher Copyright: © 2012 IEEE.
PY - 2024
Y1 - 2024
N2 - This article systematically summarizes and discusses recent research on image-based human-object interaction (HOI) detection, which aims to detect human-object pairs and recognize the interactions between humans and objects in an image. HOI detection has many applications and can serve as a basis for higher-level visual understanding tasks. We introduce existing methods by categorizing them into two main groups based on model structure: one-stage and two-stage approaches. We further divide one-stage methods into point-based, region-based, and query-based methods. Similarly, we divide two-stage methods into HOI detection with multistream modeling, with human parts and pose, with compositional learning, with graph-based modeling, and with query-based modeling. Following this taxonomy, we summarize and analyze the core ideas behind each strategy. We then present the experimental protocols, evaluation metrics, datasets, and evaluation results of the most recent representative methods. Finally, we discuss the main open challenges and future trends in HOI detection.
AB - This article systematically summarizes and discusses recent research on image-based human-object interaction (HOI) detection, which aims to detect human-object pairs and recognize the interactions between humans and objects in an image. HOI detection has many applications and can serve as a basis for higher-level visual understanding tasks. We introduce existing methods by categorizing them into two main groups based on model structure: one-stage and two-stage approaches. We further divide one-stage methods into point-based, region-based, and query-based methods. Similarly, we divide two-stage methods into HOI detection with multistream modeling, with human parts and pose, with compositional learning, with graph-based modeling, and with query-based modeling. Following this taxonomy, we summarize and analyze the core ideas behind each strategy. We then present the experimental protocols, evaluation metrics, datasets, and evaluation results of the most recent representative methods. Finally, we discuss the main open challenges and future trends in HOI detection.
UR - http://www.scopus.com/inward/record.url?scp=85181570207&partnerID=8YFLogxK
U2 - 10.1109/MCE.2023.3343919
DO - 10.1109/MCE.2023.3343919
M3 - Article
AN - SCOPUS:85181570207
SN - 2162-2248
VL - 13
SP - 56
EP - 72
JO - IEEE Consumer Electronics Magazine
JF - IEEE Consumer Electronics Magazine
ER -