Virtual try-on is an attractive task that has drawn considerable attention in the field of computer vision. However, rendering 3-D physical characteristics (e.g., pleats and shadows) from a 2-D image is very challenging. Although several previous studies have addressed 2-D-based virtual try-on, most: 1) require user-specified target poses, which are not user-friendly and may not be the best fit for the target clothing; and 2) fail to handle problematic cases involving facial details, clothing wrinkles, and body occlusions. To address these two challenges, in this article, we propose an innovative template-free try-on image synthesis (TF-TIS) network. TF-TIS first synthesizes the target pose according to the user-specified in-shop clothing. Then, given an in-shop clothing image, a user image, and the synthesized pose, we propose a novel model that synthesizes an image of the user wearing the target clothing in the best-fitting pose. Both qualitative and quantitative experiments indicate that the proposed TF-TIS outperforms state-of-the-art methods, especially on difficult cases.
|Journal|IEEE Transactions on Neural Networks and Learning Systems|
|State|Published - 26 Feb 2021|
- Cross-modal learning
- image synthesis
- pose transfer
- semantic-guided learning
- virtual try-on
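The abstract outlines a two-stage pipeline: a pose generator that infers a best-fitting target pose from the in-shop clothing alone, followed by an image-synthesis model conditioned on the clothing, the user image, and that pose. The sketch below illustrates only this data flow; the function names, keypoint count, and tensor shapes are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def synthesize_pose(clothing_img: np.ndarray, n_keypoints: int = 18) -> np.ndarray:
    """Stage 1 (illustrative): predict target-pose keypoint heatmaps from the
    in-shop clothing image alone -- no user-specified pose template."""
    h, w = clothing_img.shape[:2]
    # Placeholder for the learned pose generator; returns one heatmap per keypoint.
    return np.zeros((n_keypoints, h, w), dtype=np.float32)

def synthesize_tryon(clothing_img: np.ndarray,
                     user_img: np.ndarray,
                     pose_maps: np.ndarray) -> np.ndarray:
    """Stage 2 (illustrative): render the user wearing the target clothing in
    the synthesized pose, aiming to preserve facial details, clothing
    wrinkles, and occluded body regions."""
    # Placeholder for the learned try-on synthesis network.
    return np.zeros_like(user_img)

# End-to-end usage: only a clothing image and a user image are required.
clothing = np.zeros((256, 192, 3), dtype=np.float32)
user = np.zeros((256, 192, 3), dtype=np.float32)
pose = synthesize_pose(clothing)
result = synthesize_tryon(clothing, user, pose)
print(pose.shape, result.shape)  # (18, 256, 192) (256, 192, 3)
```

The key design point conveyed by the abstract is that stage 2 consumes a pose produced by stage 1 rather than one supplied by the user, which is what makes the method template-free.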