Human-Object Interaction Detection: An Overview

Jia Wang, Hong Han Shuai, Yung Hui Li, Wen Huang Cheng

Research output: Contribution to specialist publication › Article


This paper systematically summarizes and discusses recent research on image-based human-object interaction (HOI) detection, which aims to detect human-object pairs and recognize the interactions between the humans and objects in an image. HOI detection has many applications and can serve as a basis for higher-level visual understanding tasks. We introduce existing methods by categorizing them into two main groups based on model structure: one-stage and two-stage approaches. We further divide one-stage methods into point-based, region-based, and query-based methods. Similarly, we divide two-stage methods into HOI detection with multi-stream modeling, with human parts and pose, with compositional learning, with graph-based modeling, and with query-based modeling. Following this taxonomy, we summarize and analyze the core ideas behind each strategy. We then present the experimental protocols, evaluation metrics, datasets, and evaluation results of the most recent representative methods. Finally, we discuss the main open challenges and future trends in HOI detection.

Original language: English
Number of pages: 14
Specialist publication: IEEE Consumer Electronics Magazine
State: Accepted/In press - 2023


  • Affordances
  • Cognition
  • Consumer electronics
  • Convolutional neural networks
  • Feature extraction
  • Task analysis
  • Visualization

