Stereophonic Music Source Separation with Spatially-Informed Bridging Band-Split Network

Yichen Yang, Haowen Li, Xianrui Wang, Wen Zhang, Shoji Makino, Jingdong Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Stereophonic music source separation (MSS) is the problem of extracting individual source tracks, e.g., bass, drums, and vocals, from a stereo music recording. Deep neural network (DNN) based MSS systems have shown great promise, but they have not yet fully exploited the spatial panning cues and time-frequency spectral structures of stereo music. This paper presents a spatially-informed MSS method that uses a bridging band-split neural network to incorporate both spatial and spectral information. The spatial panning angles of each target source are fed to the network as input, along with the time-frequency spectrograms. Inter-track correlations are also exploited to further improve performance. Experiments show that the proposed method significantly outperforms the baseline systems as a result of using spatial cues, spectral characteristics, and inter-track relationships.
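The abstract does not say how the panning angles are computed or encoded, so the following is only an illustrative sketch of what a spatial panning cue can look like. It assumes constant-power amplitude panning (left gain cos θ, right gain sin θ) and estimates a per-bin panning angle from the ratio of the right and left magnitude spectrograms. The function names, STFT settings, and the energy-weighted average are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def pan_mono_to_stereo(mono, angle_rad):
    """Constant-power pan (assumed model): 0 = hard left, pi/2 = hard right."""
    left = np.cos(angle_rad) * mono
    right = np.sin(angle_rad) * mono
    return np.stack([left, right])

def stft_mag(x, n_fft=512, hop=256):
    """Magnitude STFT with a Hann window, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=-1))

def panning_angle_map(stereo, n_fft=512, hop=256):
    """Per time-frequency-bin panning angle in [0, pi/2], plus bin energy."""
    mag_l = stft_mag(stereo[0], n_fft, hop)
    mag_r = stft_mag(stereo[1], n_fft, hop)
    angles = np.arctan2(mag_r, mag_l)   # hypothetical spatial cue
    energy = mag_l ** 2 + mag_r ** 2    # used to weight reliable bins
    return angles, energy

def dominant_panning_angle(stereo):
    """Energy-weighted mean angle; meaningful for a single dominant source."""
    angles, energy = panning_angle_map(stereo)
    return float(np.sum(angles * energy) / np.sum(energy))

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)        # synthetic mono source
    stereo = pan_mono_to_stereo(tone, np.pi / 3)
    print(dominant_panning_angle(stereo))
```

In a real mixture several sources overlap in each bin, so a map like this is only a rough cue; a separation network would typically receive such angle information alongside the spectrograms rather than rely on it alone.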

Original language: English
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 786-790
Number of pages: 5
ISBN (Electronic): 9798350344851
State: Published - 2024
Event: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, Korea, Republic of
Duration: 14 Apr 2024 → 19 Apr 2024

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Country/Territory: Korea, Republic of
City: Seoul
Period: 14/04/24 → 19/04/24

Keywords

  • Stereophonic music source separation
  • bridging band-split network
  • spatial information
