TY - GEN
T1 - Multi-scale feature based salient environmental sound recognition for machine awareness
AU - Wang, Jingyu
AU - Zhang, Ke
AU - Madani, Kurash
AU - Sabourin, Christophe
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/12/9
Y1 - 2014/12/9
N2 - Auditory perception of the surrounding environment is important to machine awareness. To provide machines with artificial awareness ability, a bio-inspired method for salient environmental sound detection and recognition is proposed. Salient sounds are detected using an auditory saliency map that is based on heterogeneous saliency features from the visual and acoustic domains. Spectral and temporal saliency features from both the power spectral density (PSD) and mel-frequency cepstral coefficients (MFCC), as well as the visual saliency from the log-scale spectrogram, are applied to yield the final auditory saliency for salient sound detection. To improve the detection accuracy, short-term Shannon entropy (SSE) and a computational inhibition of return (IOR) model are initially proposed to verify the temporal saliency characteristic. The detected salient sounds are classified using features based on the fuzzy vector of spectral energy distribution and MFCC. A two-level classification based on the support vector machine (SVM) is presented for the recognition task. Experiments are carried out on real environmental sound examples. The results show that over 83% recognition accuracy can be achieved using the proposed fuzzy-vector-based features, and the overall accuracy of 94.65%
AB - Auditory perception of the surrounding environment is important to machine awareness. To provide machines with artificial awareness ability, a bio-inspired method for salient environmental sound detection and recognition is proposed. Salient sounds are detected using an auditory saliency map that is based on heterogeneous saliency features from the visual and acoustic domains. Spectral and temporal saliency features from both the power spectral density (PSD) and mel-frequency cepstral coefficients (MFCC), as well as the visual saliency from the log-scale spectrogram, are applied to yield the final auditory saliency for salient sound detection. To improve the detection accuracy, short-term Shannon entropy (SSE) and a computational inhibition of return (IOR) model are initially proposed to verify the temporal saliency characteristic. The detected salient sounds are classified using features based on the fuzzy vector of spectral energy distribution and MFCC. A two-level classification based on the support vector machine (SVM) is presented for the recognition task. Experiments are carried out on real environmental sound examples. The results show that over 83% recognition accuracy can be achieved using the proposed fuzzy-vector-based features, and the overall accuracy of 94.65%
KW - artificial awareness
KW - environment sound signal
KW - fuzzy vector
KW - heterogeneous information
KW - MFCC
KW - saliency feature fusion
KW - SVM
UR - http://www.scopus.com/inward/record.url?scp=84920527868&partnerID=8YFLogxK
U2 - 10.1109/ICAwST.2014.6981837
DO - 10.1109/ICAwST.2014.6981837
M3 - Conference contribution
AN - SCOPUS:84920527868
T3 - 2014 IEEE 6th International Conference on Awareness Science and Technology, iCAST 2014
BT - 2014 IEEE 6th International Conference on Awareness Science and Technology, iCAST 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th IEEE International Conference on Awareness Science and Technology, iCAST 2014
Y2 - 29 October 2014 through 31 October 2014
ER -