Auditory filterbanks benefit universal sound source separation

Han Li; Kean Chen; Bernhard U. Seeber

doi:10.1109/ICASSP39728.2021.9414105

School of Marine Science and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

The encoder-separator-decoder framework has become popular in recent years for separating two arbitrary sources from monaural recordings. We investigated three kinds of filterbanks in the encoder: free, parameterized, and fixed. We proposed parameterized Gammatone and Gammachirp filterbanks, which improved performance with fewer parameters and better interpretability. We then investigated the properties of the different filterbanks. After training, an entirely freely learned filterbank emerges with properties resembling a series of bandpass filters spaced on a nonlinear scale, similar to the auditory system. We also explored the separation mechanisms learned by the network through a classic auditory segregation experiment, which revealed that the model separates mixtures based on general principles of proximity in frequency and time. In summary, the results demonstrate that the separation network automatically picks up filterbank properties and separation mechanisms similar to those that have developed over millions of years in humans.
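As a rough illustration of what a fixed Gammatone filterbank with nonlinearly spaced bandpass filters might look like, the following sketch builds a small bank of Gammatone impulse responses with center frequencies spaced evenly on the ERB-rate scale. It is not the authors' implementation: the function names, the Glasberg and Moore ERB constants, the filter order of 4, the bandwidth factor 1.019, the 16 kHz sample rate, and the kernel length are all assumptions chosen for the example.

```python
import math


def erb_bandwidth(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def erb_spaced_centers(f_low, f_high, n_filters):
    """Center frequencies equally spaced on the ERB-rate scale (n_filters >= 2)."""
    e_low, e_high = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (e_high - e_low) / (n_filters - 1)
    return [erb_rate_to_hz(e_low + i * step) for i in range(n_filters)]


def gammatone_kernel(fc_hz, sample_rate, length, order=4, b=1.019):
    """Sampled Gammatone impulse response, normalized to unit peak amplitude.

    g(t) = t^(order-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t)
    """
    decay = 2.0 * math.pi * b * erb_bandwidth(fc_hz)
    kernel = []
    for k in range(length):
        t = k / sample_rate
        envelope = t ** (order - 1) * math.exp(-decay * t)
        kernel.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in kernel)
    return [v / peak for v in kernel] if peak > 0 else kernel


# Hypothetical settings: 8 filters between 100 Hz and 4 kHz at 16 kHz sampling.
centers = erb_spaced_centers(100.0, 4000.0, 8)
filterbank = [gammatone_kernel(fc, 16000, 64) for fc in centers]
```

In a parameterized variant, quantities such as the center frequency, bandwidth factor, and order would be trainable rather than fixed, which is one way such a filterbank could need far fewer parameters than freely learned kernels.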

Original language	English
Title of host publication	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	181-185
Number of pages	5
ISBN (Electronic)	9781728176055
DOIs	https://doi.org/10.1109/ICASSP39728.2021.9414105
State	Published - 2021
Event	2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada. Duration: 6 Jun 2021 → 11 Jun 2021

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume	2021-June
ISSN (Print)	1520-6149

Conference

Conference	2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021
Country/Territory	Canada
City	Virtual, Toronto
Period	6/06/21 → 11/06/21

Keywords

Learnable filterbank
Separation mechanisms
Universal source separation

Access to Document

10.1109/ICASSP39728.2021.9414105

Cite this

Li, H., Chen, K., & Seeber, B. U. (2021). Auditory filterbanks benefit universal sound source separation. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp. 181-185). (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2021-June). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP39728.2021.9414105

Li, Han ; Chen, Kean ; Seeber, Bernhard U. / Auditory filterbanks benefit universal sound source separation. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 181-185 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{05e7212c4efe4a7590ba4111101ada3e,
title = "Auditory filterbanks benefit universal sound source separation",
abstract = "The encoder-separator-decoder framework has become popular in recent years for separating two arbitrary sources from monaural recordings. We investigated three kinds of filterbanks in the encoder: free, parameterized, and fixed. We proposed parameterized Gammatone and Gammachirp filterbanks, which improved performance with fewer parameters and better interpretability. We then investigated the properties of the different filterbanks. After training, an entirely freely learned filterbank emerges with properties resembling a series of bandpass filters spaced on a nonlinear scale, similar to the auditory system. We also explored the separation mechanisms learned by the network through a classic auditory segregation experiment, which revealed that the model separates mixtures based on general principles of proximity in frequency and time. In summary, the results demonstrate that the separation network automatically picks up filterbank properties and separation mechanisms similar to those that have developed over millions of years in humans.",
keywords = "Learnable filterbank, Separation mechanisms, Universal source separation",
author = "Han Li and Kean Chen and Seeber, {Bernhard U.}",
note = "Publisher Copyright: {\textcopyright}2021 IEEE; 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 ; Conference date: 06-06-2021 Through 11-06-2021",
year = "2021",
doi = "10.1109/ICASSP39728.2021.9414105",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "181--185",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
}

Li, H, Chen, K & Seeber, BU 2021, Auditory filterbanks benefit universal sound source separation. in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2021-June, Institute of Electrical and Electronics Engineers Inc., pp. 181-185, 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021, Virtual, Toronto, Canada, 6/06/21. https://doi.org/10.1109/ICASSP39728.2021.9414105

Auditory filterbanks benefit universal sound source separation. / Li, Han; Chen, Kean; Seeber, Bernhard U.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2021. p. 181-185 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2021-June).

TY - GEN
T1 - Auditory filterbanks benefit universal sound source separation
AU - Li, Han
AU - Chen, Kean
AU - Seeber, Bernhard U.
PY - 2021
Y1 - 2021
AB - The encoder-separator-decoder framework has become popular in recent years for separating two arbitrary sources from monaural recordings. We investigated three kinds of filterbanks in the encoder: free, parameterized, and fixed. We proposed parameterized Gammatone and Gammachirp filterbanks, which improved performance with fewer parameters and better interpretability. We then investigated the properties of the different filterbanks. After training, an entirely freely learned filterbank emerges with properties resembling a series of bandpass filters spaced on a nonlinear scale, similar to the auditory system. We also explored the separation mechanisms learned by the network through a classic auditory segregation experiment, which revealed that the model separates mixtures based on general principles of proximity in frequency and time. In summary, the results demonstrate that the separation network automatically picks up filterbank properties and separation mechanisms similar to those that have developed over millions of years in humans.
KW - Learnable filterbank
KW - Separation mechanisms
KW - Universal source separation
UR - http://www.scopus.com/inward/record.url?scp=85115862667&partnerID=8YFLogxK
U2 - 10.1109/ICASSP39728.2021.9414105
DO - 10.1109/ICASSP39728.2021.9414105
M3 - Conference contribution
AN - SCOPUS:85115862667
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 181
EP - 185
BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021
Y2 - 6 June 2021 through 11 June 2021
ER -

Li H, Chen K, Seeber BU. Auditory filterbanks benefit universal sound source separation. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2021. p. 181-185. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP39728.2021.9414105
