Attention-based Video Virtual Try-On

Wen Jiin Tsai, Yi Cheng Tien

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Scopus citations

Abstract

This paper presents a parsing-free video virtual try-on model based on appearance flow warping. The model applies attention methods from the Transformer [15] in three proposed attention-based modules: a Person-Cloth Transformer, a Self-Attention Generator, and a Cloth Refinement Transformer. The Person-Cloth Transformer lets clothing features refer to person information, which benefits style vector calculation and improves the style warping process so that it estimates better appearance flows. The Self-Attention Generator applies a self-attention mechanism at the deepest feature layer, so that each position in the feature map can learn global context from all the other pixels, which helps it synthesize more realistic results. The Cloth Refinement Transformer uses two cross-attention modules: one lets the current warped clothes refer to previously warped clothes so that they are temporally consistent, and the other lets the current warped clothes refer to person information so that they are spatially aligned. Our ablation study shows that each proposed module improves the results. Experimental results show that our model generates realistic, high-quality try-on videos and performs better than existing methods.
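The self-attention and cross-attention ideas named in the abstract can be illustrated with a minimal sketch of scaled dot-product attention. This is not the authors' implementation: the token counts, channel width, random features, and the names `attention`, `cloth`, `person`, and `prev_cloth` are all assumptions made for illustration, and the learned query/key/value projections that a real Transformer layer would use are omitted.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    Each query token is replaced by a weighted average of the value tokens,
    with weights given by the softmax of its scaled dot products with the keys.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out = [sum(wi * v[c] for wi, v in zip(w, values))
               for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


def random_tokens(n, d, rng):
    """Hypothetical stand-in for flattened feature-map tokens."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
cloth = random_tokens(6, 8, rng)        # current warped-clothes tokens (assumed sizes)
person = random_tokens(10, 8, rng)      # person tokens
prev_cloth = random_tokens(6, 8, rng)   # previously warped-clothes tokens

# Self-attention: clothes tokens attend to all other clothes tokens.
self_out, self_w = attention(cloth, cloth, cloth)
# Cross-attention to the person: queries from clothes, keys/values from person.
spatial_out, spatial_w = attention(cloth, person, person)
# Cross-attention to the previous frame: keys/values from earlier warped clothes.
temporal_out, temporal_w = attention(cloth, prev_cloth, prev_cloth)
```

In each call the output keeps one row per query token, so the clothes representation keeps its shape while drawing information from whichever source supplies the keys and values.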

Original language: English
Title of host publication: ICMR 2023 - Proceedings of the 2023 ACM International Conference on Multimedia Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 209-216
Number of pages: 8
ISBN (Electronic): 9798400701788
State: Published - 12 Jun 2023
Event: 2023 ACM International Conference on Multimedia Retrieval, ICMR 2023 - Thessaloniki, Greece
Duration: 12 Jun 2023 - 15 Jun 2023

Publication series

Name: ICMR 2023 - Proceedings of the 2023 ACM International Conference on Multimedia Retrieval

Conference

Conference: 2023 ACM International Conference on Multimedia Retrieval, ICMR 2023
Country/Territory: Greece
City: Thessaloniki
Period: 12/06/23 - 15/06/23

Keywords

  • attention
  • parsing free
  • Virtual try-on
