Anime Character Recognition using Intermediate Features Aggregation

Edwin Arkel Rios, Min Chun Hu, Bo Cheng Lai

Research output: Conference contribution, peer-reviewed

4 citations (Scopus)

Abstract

In this work we study the problem of anime character recognition. Anime refers to animation produced in Japan and to work derived from or inspired by it. We propose a novel Intermediate Features Aggregation classification head, which helps smooth the optimization landscape of Vision Transformers (ViTs) by adding skip connections between intermediate layers and the classification head, thereby improving relative classification accuracy by up to 28%. The proposed model, named Animesion, is the first end-to-end framework for large-scale anime character recognition. We conduct extensive experiments using a variety of classification models, including CNNs and self-attention-based ViTs. We also adapt the ViT's multimodal variant, the Vision-Language Transformer (ViLT), to incorporate external tag data for classification without additional multimodal pre-training. Our results provide new insights into how hyperparameters such as input sequence length and mini-batch size, as well as variations on the architecture, affect the transfer learning performance of Vi(L)Ts.
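The idea of connecting intermediate layers to the classification head can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the intermediate features are aggregated, so concatenating the class token from every layer is an assumption, `toy_encoder_layer` is a hypothetical stand-in for a real transformer block, and all names and sizes are illustrative.

```python
import numpy as np

def toy_encoder_layer(tokens, weight):
    """Hypothetical stand-in for a transformer block: residual plus a nonlinear map."""
    return tokens + np.tanh(tokens @ weight)

def ifa_logits(tokens, layer_weights, head_weight, head_bias):
    """Classify from class tokens collected at every layer (assumed aggregation)."""
    cls_per_layer = []
    for weight in layer_weights:
        tokens = toy_encoder_layer(tokens, weight)
        # By convention in this sketch, the class token sits at position 0.
        cls_per_layer.append(tokens[:, 0, :])
    # Skip connections: each layer's class token reaches the head directly.
    aggregated = np.concatenate(cls_per_layer, axis=-1)  # (batch, layers * dim)
    return aggregated @ head_weight + head_bias

rng = np.random.default_rng(0)
batch, seq_len, dim, num_layers, num_classes = 2, 5, 8, 4, 10

tokens = rng.normal(size=(batch, seq_len, dim))
layer_weights = [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(num_layers)]
head_weight = rng.normal(scale=0.1, size=(num_layers * dim, num_classes))
head_bias = np.zeros(num_classes)

logits = ifa_logits(tokens, layer_weights, head_weight, head_bias)
print(logits.shape)  # (2, 10)
```

Because the head reads from every layer, a training signal would reach early layers through a short path rather than only through the full stack, which is one plausible reading of how such skip connections could ease optimization.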

Original language: English
Title of host publication: IEEE International Symposium on Circuits and Systems, ISCAS 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 424-428
Number of pages: 5
ISBN (Electronic): 9781665484855
Publication status: Published - 2022
Event: 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022 - Austin, United States
Duration: 27 May 2022 to 1 Jun 2022

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
Volume: 2022-May
ISSN (Print): 0271-4310

Conference

Conference: 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022
Country/Territory: United States
City: Austin
Period: 27/05/22 to 1/06/22
