TY - JOUR
T1 - Determined Blind Source Separation With Sinkhorn Divergence-Based Optimal Allocation of the Source Power
AU - Wang, Jianyu
AU - Guan, Shanzheng
AU - Dobigeon, Nicolas
AU - Chen, Jingdong
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Blind source separation (BSS) refers to the process of recovering multiple source signals from observations recorded by an array of sensors. Common BSS methods, including independent vector analysis (IVA) and independent low-rank matrix analysis (ILRMA), typically rely on second-order models to capture the statistical independence of source signals for separation. However, these methods generally do not account for the implicit structural information across frequency bands, which may lead to mismatches between the assumed source distributions and the distributions of the separated source signals estimated from the observed mixtures. To address these limitations, this paper shows that conventional methods such as IVA and ILRMA can readily be extended with the Sinkhorn divergence, incorporating an optimal transport (OT) framework to adaptively adjust the estimated power spectral density (PSD) of the sources. This allows for the recovery of the source PSD while modeling the inter-band signal dependence and reallocating spectral power across frequency bands. As a result, enhanced versions of these methods are developed, integrating a Sinkhorn iterative scheme into their standard implementations. Extensive simulations demonstrate that the proposed methods consistently enhance BSS performance.
KW - Blind source separation
KW - Sinkhorn divergence
KW - Wasserstein distance
KW - independent low-rank matrix analysis
KW - nonnegative matrix factorization
UR - https://www.scopus.com/pages/publications/105017671353
U2 - 10.1109/TASLPRO.2025.3609150
DO - 10.1109/TASLPRO.2025.3609150
M3 - Article
AN - SCOPUS:105017671353
SN - 1558-7916
VL - 33
SP - 3961
EP - 3974
JO - IEEE Transactions on Audio, Speech and Language Processing
JF - IEEE Transactions on Audio, Speech and Language Processing
ER -