TY - GEN
T1 - Hierarchically Aggregated Identification Transformer Network for Camouflaged Object Detection
AU - Phung, Thanh Hai
AU - Chen, Hung Jen
AU - Shuai, Hong Han
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Camouflaged object detection (COD) targets the segmentation of objects hidden in intricate environments, a task complicated by the pronounced similarities between objects and their surroundings. The diverse appearances of camouflaged objects, such as different view angles, partial visibilities, and ambiguous forms, further exacerbate this challenge. To address these issues, we introduce the Hierarchically Aggregated Identification Transformer Network (HAITNet). HAITNet harnesses local and global features to refine object localization by employing multi-scale transformer features unified through the Feature Cascaded Fusion Module (FCFM). To tackle ambiguity from indistinct textures, we present the Graph-based Low-level Feature Enhancement Module (GLFEM) and Graph-based Feature Aggregation Module (GFAM). GLFEM enhances texture representation in ambiguous areas, while GFAM reduces false positives and refines prediction maps by discerning contextual relationships. Experimental results on three widely used datasets demonstrate that the proposed HAITNet outperforms state-of-the-art approaches. Our code is available at https://github.com/underlmao/HAITNet.
KW - Camouflaged Object Detection
KW - Global-to-Local Interaction
UR - http://www.scopus.com/inward/record.url?scp=85206582161&partnerID=8YFLogxK
U2 - 10.1109/ICME57554.2024.10687759
DO - 10.1109/ICME57554.2024.10687759
M3 - Conference contribution
AN - SCOPUS:85206582161
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
PB - IEEE Computer Society
T2 - 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
Y2 - 15 July 2024 through 19 July 2024
ER -