
Environment sound classification using a two-stream CNN based on decision-level fusion

  • Yu Su
  • Ke Zhang
  • Jingyu Wang
  • Kurosh Madani
  • Northwestern Polytechnical University Xian
  • Université Paris-Est Créteil

Research output: Contribution to journal, Article, peer-reviewed

182 citations (Scopus)

Abstract

With the popularity of using deep learning-based models in various categorization problems and their proven robustness compared to conventional methods, a growing number of researchers have exploited such methods in environment sound classification tasks in recent years. However, the performance of existing models that use auditory features like the log-mel spectrogram (LM) and mel frequency cepstral coefficients (MFCC), or the raw waveform, to train deep neural networks for environment sound classification (ESC) is unsatisfactory. In this paper, we first propose two combined features to give a more comprehensive representation of environment sounds. Then, a four-layer convolutional neural network (CNN) is presented to improve the performance of ESC with the proposed aggregated features. Finally, the CNNs trained with different features are fused using the Dempster–Shafer evidence theory to compose the TSCNN-DS model. The experimental results indicate that our combined features with the four-layer CNN are appropriate for environment sound taxonomic problems and dramatically outperform other conventional methods. The proposed TSCNN-DS model achieves a classification accuracy of 97.2%, which is the highest taxonomic accuracy on the UrbanSound8K dataset compared to existing models.
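As a rough illustration of the decision-level fusion step mentioned in the abstract, the sketch below applies Dempster's rule of combination to the class scores of two classifiers. It assumes each stream's softmax output is used directly as a basic probability assignment over single classes only; the abstract does not state how the masses are built in TSCNN-DS, so this is an assumption, and the function name `dempster_combine` and the example scores are hypothetical.

```python
def dempster_combine(m1, m2):
    """Dempster's rule for two mass vectors defined on singleton classes only.

    Assumption: m1[i] and m2[i] are the masses each source assigns to class i,
    and each vector sums to 1 (e.g. a softmax output).
    """
    if len(m1) != len(m2):
        raise ValueError("both sources must score the same set of classes")
    for m in (m1, m2):
        if abs(sum(m) - 1.0) > 1e-6:
            raise ValueError("each mass vector must sum to 1")

    # Mass on which both sources agree (same class picked by each).
    agreement = [a * b for a, b in zip(m1, m2)]
    # Conflict K: total mass assigned to pairs of different classes.
    conflict = 1.0 - sum(agreement)
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")

    # Renormalise the agreeing mass by 1 - K.
    return [x / (1.0 - conflict) for x in agreement]


# Hypothetical scores from two streams over three classes.
stream_a = [0.6, 0.3, 0.1]
stream_b = [0.5, 0.1, 0.4]

fused = dempster_combine(stream_a, stream_b)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

With singleton-only masses the rule reduces to a normalised element-wise product, so a class scores highly after fusion only if both streams support it.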

Original language: English
Article number: 1733
Journal: Sensors
Volume: 19
Issue number: 7
DOI
Publication status: Published - 1 Apr 2019
