Robust acoustic event recognition using AVMD-PWVD time-frequency image

Yanhua Zhang, Ke Zhang, Jingyu Wang, Yu Su

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Environmental sound feature extraction and classification are important signal analysis tools in many applications, such as audio surveillance, multimedia retrieval, and auditory source identification. However, the non-stationarity and discontinuity of environmental signals make quantification and classification a formidable challenge. Researchers have therefore proposed time-frequency image representations to quantify this non-stationarity, resulting in higher classification accuracy. In this paper, a time-frequency representation method is proposed for environmental sound signals. Our approach consists of three stages. First, we propose an adaptive variational modal decomposition (AVMD), based on the central angular frequency difference, to decompose environmental sounds into a series of modes. Second, we use the pseudo Wigner-Ville distribution (PWVD) to accurately obtain the instantaneous frequency characteristics of the mode signals. Third, time-frequency images of the sound signals are obtained by combining the mode signals with the PWVD. The resulting time-frequency image is then fed into a convolutional neural network (CNN) for classification. The method is tested on the Real World Computing Partnership (RWCP) Sound Scene Database of 50 classes in mismatched conditions. Results show that our method is robust to noise and achieves the best average recognition accuracy compared with several state-of-the-art methods under clean and various noisy conditions.
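
The PWVD stage mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it covers only a generic discrete pseudo Wigner-Ville distribution of a single signal and leaves out AVMD and the CNN. The function names `analytic_signal` and `pwvd`, the Hamming lag window, and the default `half_window` and `n_fft` values are all assumptions chosen for illustration.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal computed with the FFT by zeroing negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def pwvd(x, half_window=63, n_fft=256):
    """Pseudo Wigner-Ville distribution of a real signal.

    Returns an array of shape (n_fft, len(x)). Row k corresponds to the
    frequency k * fs / (2 * n_fft), because the kernel uses lags of 2m.
    """
    if n_fft < 2 * half_window + 1:
        raise ValueError("n_fft must be at least 2 * half_window + 1")
    z = analytic_signal(x)
    n = z.size
    # Lag window h[m]; the Hamming shape is an assumed choice.
    window = np.hamming(2 * half_window + 1)
    tfr = np.zeros((n_fft, n))
    for t in range(n):
        # Keep only lags for which both t + m and t - m lie inside the signal.
        max_lag = min(t, n - 1 - t, half_window)
        m = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(n_fft, dtype=complex)
        kernel[m % n_fft] = window[m + half_window] * z[t + m] * np.conj(z[t - m])
        # The kernel is Hermitian in m, so its FFT is real up to rounding.
        tfr[:, t] = np.fft.fft(kernel).real
    return tfr


if __name__ == "__main__":
    fs = 1000.0                               # assumed sampling rate in Hz
    time = np.arange(512) / fs
    tone = np.cos(2 * np.pi * 100.0 * time)   # 100 Hz test tone
    image = pwvd(tone)
    peak_bin = int(np.argmax(image[:, 256]))
    print("peak frequency (Hz):", peak_bin * fs / (2 * 256))
```

In a pipeline like the one the abstract describes, a function of this kind would presumably be applied to each mode produced by the decomposition rather than to the raw signal. How the per-mode distributions are combined into one image is not specified in the abstract, so the sketch does not attempt it.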

Original language: English
Article number: 107970
Journal: Applied Acoustics
Volume: 178
Publication status: Published - Jul 2021
