Multi-scale feature based salient environmental sound recognition for machine awareness

Jingyu Wang, Ke Zhang, Kurash Madani, Christophe Sabourin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citations

Abstract

Auditory perception of the surrounding environment is important to machine awareness. To give machines an artificial awareness ability, a bio-inspired method for detecting and recognizing salient environmental sounds is proposed. Salient sounds are detected using an auditory saliency map that is based on heterogeneous saliency features from the visual and acoustic domains. Spectral and temporal saliency features from both the power spectral density (PSD) and mel-frequency cepstral coefficients (MFCC), together with the visual saliency from the log-scale spectrogram, are combined to yield the final auditory saliency for salient sound detection. To improve detection accuracy, short-term Shannon entropy (SSE) and a computational inhibition of return (IOR) model are initially proposed to verify the temporal saliency characteristic. The detected salient sounds are classified using features based on the fuzzy vector of the spectral energy distribution and MFCC. A two-level classification based on the support vector machine (SVM) is presented for the recognition task. Experiments are carried out on real environmental sound examples. The results show that over 83% recognition accuracy can be achieved by using the proposed fuzzy vector based features, and the overall accuracy of 94.65%
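The abstract names short-term Shannon entropy (SSE) but does not define it. As a rough illustration only, the sketch below computes a Shannon entropy per overlapping frame from an amplitude histogram, which is one common way to obtain a frame-level entropy curve. The function names, the histogram-based estimate, and the parameters `frame_len`, `hop`, and `n_bins` are assumptions made for this example and may differ from the authors' formulation; the IOR model and the saliency map are not covered.

```python
import math
from collections import Counter


def shannon_entropy(frame, n_bins=16):
    """Shannon entropy (in bits) of the amplitude histogram of one frame.

    Assumption: amplitudes are binned uniformly between the frame's own
    minimum and maximum; the paper may use a different estimate.
    """
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return 0.0  # constant frame: a single occupied bin, zero entropy
    width = (hi - lo) / n_bins
    # Map each sample to a bin index; the maximum is clamped into the last bin.
    counts = Counter(min(int((x - lo) / width), n_bins - 1) for x in frame)
    total = len(frame)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def short_term_entropy(signal, frame_len=256, hop=128, n_bins=16):
    """Entropy of each overlapping frame of `signal` (hypothetical defaults)."""
    return [
        shannon_entropy(signal[start:start + frame_len], n_bins)
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]
```

Under these assumptions, a frame that is constant scores zero and a frame whose samples spread evenly over all bins scores `log2(n_bins)`, so changes in the entropy curve over time could serve as one cue for temporal saliency.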

Original language: English
Title of host publication: 2014 IEEE 6th International Conference on Awareness Science and Technology, iCAST 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479973736
State: Published - 9 Dec 2014
Event: 6th IEEE International Conference on Awareness Science and Technology, iCAST 2014 - Paris, France
Duration: 29 Oct 2014 → 31 Oct 2014

Publication series

Name: 2014 IEEE 6th International Conference on Awareness Science and Technology, iCAST 2014

Conference

Conference: 6th IEEE International Conference on Awareness Science and Technology, iCAST 2014
Country/Territory: France
City: Paris
Period: 29/10/14 → 31/10/14

Keywords

  • artificial awareness
  • environment sound signal
  • fuzzy vector
  • heterogeneous information
  • MFCC
  • saliency feature fusion
  • SVM
