Stereophonic Music Source Separation with Spatially-Informed Bridging Band-Split Network

Yichen Yang, Haowen Li, Xianrui Wang, Wen Zhang, Shoji Makino, Jingdong Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Stereophonic music source separation (MSS) is the problem of extracting individual source tracks, e.g., bass, drums, and vocals, from a stereo music recording. Deep neural network (DNN) based MSS systems have demonstrated great promise, though the spatial panning cues and time-frequency spectral structures in stereo music have not yet been fully explored in such systems. This paper presents a spatially-informed MSS method using a bridging band-split neural network that incorporates both spatial and spectral information. The spatial panning angles of each target source are used as input to the network, along with the time-frequency spectrograms. Moreover, inter-track correlations are exploited for further performance improvement. Experiments show that the proposed method significantly outperforms the baseline systems as a result of using spatial cues, spectral characteristics, and inter-track relationships.
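
As a rough illustration of what a spatial panning cue can look like, the sketch below pans a mono tone into stereo and then recovers the panning angle from the two channel levels. It is not the paper's method: the abstract does not define how the panning angles are computed or fed to the network. The sketch assumes a constant-power pan law and a single source, and the function names `pan_constant_power` and `estimate_pan_angle`, along with the test tone, are hypothetical.

```python
import math


def pan_constant_power(mono, theta):
    """Pan a mono signal to stereo with the constant-power law (an assumption).

    theta is in radians: 0 is hard left, pi/2 is hard right.
    """
    gain_left, gain_right = math.cos(theta), math.sin(theta)
    left = [gain_left * x for x in mono]
    right = [gain_right * x for x in mono]
    return left, right


def estimate_pan_angle(left, right):
    """Estimate the panning angle (radians) from the channel RMS levels."""
    rms_left = math.sqrt(sum(x * x for x in left) / len(left))
    rms_right = math.sqrt(sum(x * x for x in right) / len(right))
    # atan2 handles a silent left channel without dividing by zero.
    return math.atan2(rms_right, rms_left)


# Hypothetical test signal: a 440 Hz tone sampled at 16 kHz, panned to 30 degrees.
mono = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
left, right = pan_constant_power(mono, math.radians(30))
estimated_degrees = math.degrees(estimate_pan_angle(left, right))
```

With a single panned source the level ratio between the channels determines the angle directly. A real mix contains several overlapping sources, so a cue of this kind would normally be computed per time-frequency bin rather than over the whole signal.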

Original language: English
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 786-790
Number of pages: 5
ISBN (electronic): 9798350344851
DOI
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, South Korea
Duration: 14 Apr 2024 to 19 Apr 2024

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (print): 1520-6149

Conference

Conference: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Country/Territory: South Korea
City: Seoul
Period: 14/04/24 to 19/04/24
