TY - JOUR
T1 - Research on an environmental sound classification system based on hybrid features and convolutional neural networks
AU - Zhang, Ke
AU - Su, Yu
AU - Wang, Jingyu
AU - Wang, Sanyu
AU - Zhang, Yanhua
N1 - Publisher Copyright:
© 2020 Journal of Northwestern Polytechnical University.
PY - 2020/2/1
Y1 - 2020/2/1
N2 - At present, environment sound recognition systems mainly identify environment sounds using deep neural networks and a wide variety of auditory features. It is therefore necessary to analyze which auditory features are more suitable for deep-neural-network-based environment sound classification and recognition (ESCR) systems. In this paper, we choose three sound features that are based on two widely used filter banks: the Mel and Gammatone filter banks. Subsequently, the hybrid feature MGCC is presented. A deep convolutional neural network is then proposed to verify which features are more suitable for ESCR tasks. The experimental results show that the signal processing features outperform the spectrogram features in the deep-neural-network-based environmental sound recognition system. Among all the acoustic features, the MGCC feature achieves the best performance. Finally, the MGCC-CNN model proposed in this paper is compared with state-of-the-art environmental sound classification models on the UrbanSound 8K dataset. The results show that the proposed model achieves the highest classification accuracy among the models compared.
AB - At present, environment sound recognition systems mainly identify environment sounds using deep neural networks and a wide variety of auditory features. It is therefore necessary to analyze which auditory features are more suitable for deep-neural-network-based environment sound classification and recognition (ESCR) systems. In this paper, we choose three sound features that are based on two widely used filter banks: the Mel and Gammatone filter banks. Subsequently, the hybrid feature MGCC is presented. A deep convolutional neural network is then proposed to verify which features are more suitable for ESCR tasks. The experimental results show that the signal processing features outperform the spectrogram features in the deep-neural-network-based environmental sound recognition system. Among all the acoustic features, the MGCC feature achieves the best performance. Finally, the MGCC-CNN model proposed in this paper is compared with state-of-the-art environmental sound classification models on the UrbanSound 8K dataset. The results show that the proposed model achieves the highest classification accuracy among the models compared.
KW - Convolutional neural network
KW - Environment sound
KW - Filter
KW - Hybrid feature
KW - Sound classification
UR - http://www.scopus.com/inward/record.url?scp=85081390748&partnerID=8YFLogxK
U2 - 10.1051/jnwpu/20203810162
DO - 10.1051/jnwpu/20203810162
M3 - Article
AN - SCOPUS:85081390748
SN - 1000-2758
VL - 38
SP - 162
EP - 169
JO - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
JF - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
IS - 1
ER -