TY - JOUR
T1 - Seal call recognition based on general regression neural network using Mel-frequency cepstrum coefficient features
AU - Yao, Qihai
AU - Wang, Yong
AU - Yang, Yixin
AU - Shi, Yang
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/12
Y1 - 2023/12
N2 - In this paper, general regression neural network (GRNN) with the input feature of Mel-frequency cepstrum coefficient (MFCC) is employed to automatically recognize the calls of leopard, ross, and weddell seals with widely overlapping living areas. As a feedforward network, GRNN has only one network parameter, i.e., spread factor. The recognition performance can be greatly improved by determining the spread factor based on the cross-validation method. This paper selects the audio data of the calls of the above three kinds of seals and compares the recognition performance of three machine learning models for inputting MFCC features and low-frequency analyzer and recorder (LOFAR) spectrum. The results show that at the same signal-to-noise ratio (SNR), the recognition result of the MFCC feature is better than that of the LOFAR spectrum, which is verified by statistical histogram. Compared with other models, GRNN for inputting MFCC features has better recognition performance and can still achieve effective recognition at low SNRs. Specifically, the accuracy is 97.36%, 93.44%, 92.00% and 88.38% for cases with an infinite SNR and SNR of 10, 5 and 0 dB, respectively. In particular, GRNN has the least training and testing time. Therefore, all results show that the proposed method has excellent performance for the seal call recognition.
AB - In this paper, general regression neural network (GRNN) with the input feature of Mel-frequency cepstrum coefficient (MFCC) is employed to automatically recognize the calls of leopard, ross, and weddell seals with widely overlapping living areas. As a feedforward network, GRNN has only one network parameter, i.e., spread factor. The recognition performance can be greatly improved by determining the spread factor based on the cross-validation method. This paper selects the audio data of the calls of the above three kinds of seals and compares the recognition performance of three machine learning models for inputting MFCC features and low-frequency analyzer and recorder (LOFAR) spectrum. The results show that at the same signal-to-noise ratio (SNR), the recognition result of the MFCC feature is better than that of the LOFAR spectrum, which is verified by statistical histogram. Compared with other models, GRNN for inputting MFCC features has better recognition performance and can still achieve effective recognition at low SNRs. Specifically, the accuracy is 97.36%, 93.44%, 92.00% and 88.38% for cases with an infinite SNR and SNR of 10, 5 and 0 dB, respectively. In particular, GRNN has the least training and testing time. Therefore, all results show that the proposed method has excellent performance for the seal call recognition.
KW - GRNN
KW - Machine learning
KW - MFCC feature
KW - Seal call recognition
KW - Underwater acoustic signal
UR - http://www.scopus.com/inward/record.url?scp=85158953718&partnerID=8YFLogxK
U2 - 10.1186/s13634-023-01014-1
DO - 10.1186/s13634-023-01014-1
M3 - 文章
AN - SCOPUS:85158953718
SN - 1687-6172
VL - 2023
JO - Eurasip Journal on Advances in Signal Processing
JF - Eurasip Journal on Advances in Signal Processing
IS - 1
M1 - 48
ER -