IIOF: Intra- and Inter-feature orthogonal fusion of local and global features for music emotion recognition

Pei Chun Chang, Yong Sheng Chen, Chang Hsing Lee*

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review


In this paper, we propose intra- and inter-feature orthogonal fusion (IIOF) of local and global features obtained from MS-SincResNet or MS-SSincResNet (a variant of MS-SincResNet) for music emotion recognition (MER). Given the raw waveform of a music signal, MS-SincResNet/MS-SSincResNet is first used to learn several 2D representations with different receptive fields and to obtain embeddings carrying time-frequency information from different layers. Local and global features are then extracted from these embeddings. IIOF, which consists of intra-feature orthogonal fusion (OF) and inter-feature OF, is then employed to integrate the local and global features into a discriminative descriptor for MER. The intra-feature OF enhances the diversity of the global feature, while the inter-feature OF reduces redundancy between the local and global features so that they contribute complementary information. Experimental results demonstrate that taking feature orthogonality into account through IIOF enhances the discriminability of the representation. Furthermore, extensive experiments show that the proposed method outperforms other state-of-the-art methods on both regression and classification tasks on two well-known MER datasets, DEAM and PMEmo. The code is available at https://github.com/PeiChunChang/MS-SSincResNet_with_IIOF.
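The abstract does not give the exact formulation of either fusion step, so the following is only a minimal sketch of the general idea behind orthogonal fusion, assuming the common approach of removing from each local feature vector its projection onto the global feature vector and then concatenating the pooled remainder with the global feature. The function names (`orthogonal_component`, `fuse`), the array shapes, and the mean pooling are illustrative assumptions and are not taken from the paper or its repository.

```python
import numpy as np

def orthogonal_component(local_feats, global_feat, eps=1e-12):
    """Remove from each local vector its projection onto the global vector.

    local_feats: array of shape (N, C), one C-dimensional vector per position
                 (hypothetical layout, assumed for this sketch).
    global_feat: array of shape (C,).
    """
    f = np.asarray(local_feats, dtype=np.float64)
    g = np.asarray(global_feat, dtype=np.float64)
    # Projection coefficient of every local vector onto g: (f . g) / (g . g)
    coeff = (f @ g) / (g @ g + eps)
    # Subtract the component parallel to g; what remains is orthogonal to g
    return f - np.outer(coeff, g)

def fuse(local_feats, global_feat):
    """Concatenate the pooled orthogonal local part with the global feature."""
    orth = orthogonal_component(local_feats, global_feat)
    pooled = orth.mean(axis=0)  # average pooling over positions (an assumption)
    return np.concatenate([pooled, np.asarray(global_feat, dtype=np.float64)])

# Toy data standing in for network embeddings
rng = np.random.default_rng(0)
local_feats = rng.normal(size=(16, 8))
global_feat = rng.normal(size=8)

orth = orthogonal_component(local_feats, global_feat)
descriptor = fuse(local_feats, global_feat)

# Every remaining local vector is (numerically) orthogonal to the global one
print(np.allclose(orth @ global_feat, 0.0, atol=1e-9))
print(descriptor.shape)
```

Because the parallel component is subtracted, the local part of the descriptor carries only information not already expressed along the global direction, which is one way to read the redundancy reduction described above. The authors' actual intra- and inter-feature OF may differ from this sketch.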

Original language: English
Article number: 110200
Journal: Pattern Recognition
State: Published - Apr 2024


Keywords:

  • Music emotion recognition
  • Orthogonal fusion
  • ResNet
  • SincNet

