A multi-scale fully convolutional network for singing melody extraction

Ping Gao, Cheng You You, Tai-Shih Chi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Melody extraction can be considered either a sequence-to-sequence task or a classification task. Many recent models based on semantic segmentation have proven very effective for melody extraction. In this paper, we build a fully convolutional network (FCN) for melody extraction from polyphonic music. Inspired by state-of-the-art semantic segmentation architectures, we construct the encoder in a dense way and design the decoder accordingly for audio processing. The combined frequency and periodicity (CFP) representation, which contains spectral and cepstral information, is adopted as the input feature of the proposed model. We compare the performance of the proposed model with several methods on various datasets. Experimental results show that the proposed model achieves state-of-the-art performance with less computation and fewer parameters.

Original language: English
Title of host publication: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1288-1293
Number of pages: 6
ISBN (Electronic): 9781728132488
State: Published - Nov 2019
Event: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 - Lanzhou, China
Duration: 18 Nov 2019 → 21 Nov 2019

Publication series

Name: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019

Conference

Conference: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Country/Territory: China
City: Lanzhou
Period: 18/11/19 → 21/11/19
