TY - GEN
T1 - Cascade Meta-RCNN for Few-shot Object Detection
AU - Li, Shuting
AU - Jiang, Qian
AU - Jin, Xin
AU - Liu, Nanqing
AU - Chen, Shiyu
AU - Lee, Shin Jye
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Data annotation is often labor-intensive and time-consuming, and in some cases certain data are challenging even to collect. Few-shot object detection (FSOD) addresses this challenge by recognizing objects from very few examples. Meta-learning-based FSOD usually trains models on a limited set of samples, enabling them to learn and generalize to new samples. Currently, meta-learning is widely used in two-stage object detection, where query features and support features are aggregated to obtain the final classification scores. This aggregation places a higher demand on accurate regions of interest (RoIs). However, these methods only consider training samples selected with a single Intersection over Union (IoU) threshold. In this paper, we incorporate a cascade structure into the existing Meta-RCNN model and name the resulting model Cascade Meta-RCNN. Specifically, the query feature is aggregated with prototypes of the support set. Then, the aggregated features are fed sequentially into different RoI-Heads, which are trained with progressively increasing IoU thresholds. This compels the network to generate more accurate query RoI features for matching with support prototypes. Additionally, we integrate a Channel and Spatial Attention (CSA) module into the model's backbone, enhancing the network's discriminative ability and further boosting its performance. To validate the effectiveness of our approach, we conduct a series of experiments on the PASCAL VOC dataset. The results demonstrate that our method outperforms current state-of-the-art methods.
KW - attention mechanism
KW - cascade network
KW - few-shot learning
KW - meta-learning
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=85191321012&partnerID=8YFLogxK
U2 - 10.1109/ISPA-BDCloud-SocialCom-SustainCom59178.2023.00093
DO - 10.1109/ISPA-BDCloud-SocialCom-SustainCom59178.2023.00093
M3 - Conference contribution
AN - SCOPUS:85191321012
T3 - Proceedings - 2023 IEEE International Conference on Parallel and Distributed Processing with Applications, Big Data and Cloud Computing, Sustainable Computing and Communications, Social Computing and Networking, ISPA/BDCloud/SocialCom/SustainCom 2023
SP - 466
EP - 473
BT - Proceedings - 2023 IEEE International Conference on Parallel and Distributed Processing with Applications, Big Data and Cloud Computing, Sustainable Computing and Communications, Social Computing and Networking, ISPA/BDCloud/SocialCom/SustainCom 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 21st IEEE International Symposium on Parallel and Distributed Processing with Applications, 13th IEEE International Conference on Big Data and Cloud Computing, 16th IEEE International Conference on Social Computing and Networking and 13th International Conference on Sustainable Computing and Communications, ISPA/BDCloud/SocialCom/SustainCom 2023
Y2 - 21 December 2023 through 24 December 2023
ER -