DNN Audio Classification Based on Extracted Spectral Attributes

Pei Chen Lo*, Chuan Yi Liu, Tsung Hsien Chou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Recent advances in multimedia systems deliver remarkable audio-visual experiences in fields including entertainment, education, communication, and industrial design. Audio quality enhancement is therefore important to that experience. However, the methods and techniques for improving audio quality depend heavily on the type of audio content, such as human voice, music of different genres, or audio from various programs. This study develops an effective method for real-time audio classification based on a deep learning scheme. The three classes of interest are classical music, non-classical music, and news. Subband-power distribution (SPD) is a one-dimensional feature based on the audio power in the frequency domain; it effectively reflects the spectral attributes of various audio content and allows a deep neural network (DNN) audio classifier to be implemented in real time. This study develops different DNN models according to various input designs: the original SPD at different frequency resolutions, and SPD pre-processed by principal component analysis (PCA). The overall accuracy (Acc) and the prediction accuracy of each class, obtained from the confusion matrix (CFM), are evaluated to compare performance. According to our results, the DNN audio classifier with PCA-pre-processed SPD input not only achieves better performance but also remarkably reduces memory requirements and computation time.
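
As a rough illustration of the quantities named in the abstract, the sketch below computes an SPD-like feature for one audio frame and derives overall and per-class accuracy from a confusion matrix. It is not the authors' implementation. The abstract does not specify the frame length, windowing, subband edges, or normalization, so the equal-width subbands, the normalization to unit sum, the row-is-true-class convention, and all function names here are assumptions. The PCA step and the DNN itself are omitted, and a naive DFT is used only to keep the example dependency-free; an FFT routine would be the practical choice.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum of a real-valued frame via a naive O(N^2) DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def subband_power_distribution(frame, n_subbands):
    """Fraction of total power in each of n_subbands equal-width bands.

    Equal-width bands and unit-sum normalization are assumptions, not
    details taken from the paper.
    """
    power = power_spectrum(frame)
    n_bins = len(power)
    spd = [0.0] * n_subbands
    for k, p in enumerate(power):
        band = min(k * n_subbands // n_bins, n_subbands - 1)
        spd[band] += p
    total = sum(spd)
    return [p / total for p in spd] if total > 0 else spd


def accuracy_from_confusion(cfm):
    """Overall accuracy and per-class accuracy from a square confusion matrix.

    Assumed convention: rows are true classes, columns are predicted
    classes, so per-class accuracy is the row-wise recall.
    """
    total = sum(sum(row) for row in cfm)
    correct = sum(cfm[i][i] for i in range(len(cfm)))
    per_class = [row[i] / sum(row) if sum(row) else 0.0
                 for i, row in enumerate(cfm)]
    return correct / total, per_class


if __name__ == "__main__":
    # Synthetic 64-sample sine at DFT bin 20 (toy input, not data from the paper).
    frame = [math.sin(2 * math.pi * 20 * t / 64) for t in range(64)]
    print(subband_power_distribution(frame, 4))
```

Changing `n_subbands` corresponds to the "different frequency resolutions" mentioned in the abstract: more subbands give a finer but longer input vector, which is the cost that a dimensionality-reduction step such as PCA is meant to offset.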

Original language: English
Title of host publication: Proceedings - 2022 14th International Conference on Signal Processing Systems, ICSPS 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 259-262
Number of pages: 4
ISBN (Electronic): 9798350336313
State: Published - 2022
Event: 14th International Conference on Signal Processing Systems, ICSPS 2022 - Virtual, Online, China
Duration: 18 Nov 2022 to 20 Nov 2022

Publication series

Name: Proceedings - 2022 14th International Conference on Signal Processing Systems, ICSPS 2022

Conference

Conference: 14th International Conference on Signal Processing Systems, ICSPS 2022
Country/Territory: China
City: Virtual, Online
Period: 18/11/22 to 20/11/22

Keywords

  • Audio classification
  • Deep learning
  • Deep neural network (DNN)
  • Principal component analysis (PCA)
  • Real-time process
  • Subband-power distribution (SPD)
