Determined Blind Source Separation With Sinkhorn Divergence-Based Optimal Allocation of the Source Power

Jianyu Wang, Shanzheng Guan, Nicolas Dobigeon, Jingdong Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Blind source separation (BSS) refers to the process of recovering multiple source signals from observations recorded by an array of sensors. Common BSS methods, including independent vector analysis (IVA) and independent low-rank matrix analysis (ILRMA), typically rely on second-order models to capture the statistical independence of source signals for separation. However, these methods generally do not account for the implicit structural information across frequency bands, which may lead to mismatches between the assumed source distributions and the distributions of the separated source signals estimated from the observed mixtures. To address these limitations, this paper shows that conventional methods such as IVA and ILRMA can readily be extended with the Sinkhorn divergence, incorporating an optimal transport (OT) framework to adaptively adjust the estimated power spectral density (PSD) of the sources. This allows the source PSD to be recovered while modeling the inter-band signal dependence and reallocating spectral power across frequency bands. As a result, enhanced versions of these methods are developed by integrating a Sinkhorn iterative scheme into their standard implementations. Extensive simulations demonstrate that the proposed methods consistently enhance BSS performance.
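
To make the Sinkhorn iterative scheme mentioned in the abstract more concrete, the following is a minimal illustrative sketch in Python with NumPy. It is not the authors' algorithm and does not include the IVA or ILRMA updates. It only shows generic Sinkhorn iterations that compute an entropy-regularized transport plan between two normalized toy PSDs defined over frequency bins, followed by one common debiased form of the Sinkhorn divergence built from transport costs. All names (`sinkhorn_plan`, `transport_cost`, `sinkhorn_divergence`, `toy_psd`), the squared-distance cost between normalized frequencies, the Gaussian toy spectra, and the values of `eps` and `n_iter` are assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, eps=0.1, n_iter=1000):
    """Entropy-regularized OT plan between two histograms that each sum to 1."""
    K = np.exp(-cost / eps)            # Gibbs kernel from the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)                # rescale rows toward marginal a
        v = b / (K.T @ u)              # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]

def transport_cost(a, b, cost, eps=0.1, n_iter=1000):
    """Cost <P, C> of the entropic plan P (assumed variant, entropy term omitted)."""
    plan = sinkhorn_plan(a, b, cost, eps, n_iter)
    return float(np.sum(plan * cost))

def sinkhorn_divergence(a, b, cost, eps=0.1, n_iter=1000):
    """Debiased divergence: OT(a, b) - 0.5 * (OT(a, a) + OT(b, b))."""
    ot_ab = transport_cost(a, b, cost, eps, n_iter)
    ot_aa = transport_cost(a, a, cost, eps, n_iter)
    ot_bb = transport_cost(b, b, cost, eps, n_iter)
    return ot_ab - 0.5 * (ot_aa + ot_bb)

# Toy setup (hypothetical): 32 frequency bins on a normalized axis.
freqs = np.linspace(0.0, 1.0, 32)
cost = (freqs[:, None] - freqs[None, :]) ** 2   # squared distance between bins

def toy_psd(center, width=0.08):
    """Gaussian-shaped spectrum normalized to unit total power."""
    p = np.exp(-0.5 * ((freqs - center) / width) ** 2)
    return p / p.sum()

a = toy_psd(0.25)                      # power concentrated at low frequencies
b = toy_psd(0.75)                      # power concentrated at high frequencies

plan = sinkhorn_plan(a, b, cost)       # plan[i, j]: power moved from bin i to bin j
div_ab = sinkhorn_divergence(a, b, cost)
div_aa = sinkhorn_divergence(a, a, cost)
print(div_ab, div_aa)
```

In this sketch, entry `plan[i, j]` can be read as the amount of spectral power reallocated from bin `i` to bin `j`, which is the sense in which an OT formulation couples frequency bands. The debiasing terms make the divergence of a spectrum with itself vanish. How the paper itself defines the cost, the regularization, and the coupling with the demixing updates is not stated in the abstract.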

Original language: English
Pages (from-to): 3961-3974
Number of pages: 14
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 33
State: Published - 2025

Keywords

  • Blind source separation
  • Sinkhorn divergence
  • Wasserstein distance
  • independent low-rank matrix analysis
  • nonnegative matrix factorization
