TY - GEN
T1 - Timbre-enhanced Multi-modal Music Style Transfer with Domain Balance Loss
AU - Fan, Tsai Jyun
AU - Lu, Chien Yu
AU - Chiu, Wei Chen
AU - Su, Li
AU - Lee, Che Rung
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/12
Y1 - 2020/12
AB - Style transfer of polyphonic music recordings is a challenging task because of the difficulty of learning representations for both the domain-invariant (i.e., content) and domain-variant (i.e., style) features of music. Although prior works employ the Multi-modal Unsupervised Image-to-Image Translation (MUNIT) framework to perform music style transfer in an unsupervised manner and provide promising results, the gap between the transferred music recordings and real ones is still noticeable. To reduce this gap, we propose and experiment with several techniques for improving the transferred results, including the domain balance loss, up-sampling, a content discriminator, a recycle loss, and data scaling. We conduct extensive experiments on bilateral style transfer among four different genres: piano solo, guitar solo, string quartet, and chiptune. For evaluation, we propose an objective testing scheme to investigate the pros and cons of each proposed technique, and we design a subjective testing method to compare different approaches, showing that our proposed method outperforms prior works.
UR - http://www.scopus.com/inward/record.url?scp=85103828795&partnerID=8YFLogxK
U2 - 10.1109/TAAI51410.2020.00027
DO - 10.1109/TAAI51410.2020.00027
M3 - Conference contribution
AN - SCOPUS:85103828795
T3 - Proceedings - 25th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2020
SP - 102
EP - 107
BT - Proceedings - 25th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 25th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2020
Y2 - 3 December 2020 through 5 December 2020
ER -