TY - JOUR
T1 - Smoothed Frame-Level SINR and Its Estimation for Sensor Selection in Distributed Acoustic Sensor Networks
AU - Guan, Shanzheng
AU - Wang, Mou
AU - Bai, Zhongxin
AU - Wang, Jianyu
AU - Chen, Jingdong
AU - Benesty, Jacob
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Distributed acoustic sensor network (DASN) refers to a sound acquisition system that consists of a collection of microphones randomly distributed across a wide acoustic area. Theory and methods for DASN are gaining increasing attention as the associated technologies can be used in a broad range of applications to solve challenging problems. However, unlike traditional microphone arrays or centralized systems, properly exploiting the redundancy among different channels in DASN is facing many challenges including but not limited to variations in pre-amplification gains, clocks, sensors' response, and signal-to-interference-plus-noise ratios (SINRs). Selecting appropriate sensors relevant to the task at hand is therefore crucial in DASN. In this work, we propose a speaker-dependent smoothed frame-level SINR estimation method for sensor selection in multi-speaker scenarios, specifically addressing source movement within DASN. Additionally, we devise an approach for similarity measurement to generate dynamic speaker embeddings resilient to variations in reference speech levels. Furthermore, we introduce a novel loss function that integrates classification and ordinal regression within a unified framework. Extensive simulations are performed and the results demonstrate the efficacy of the proposed method in accurately estimating smoothed frame-level SINR dynamically, yielding state-of-the-art performance.
AB - Distributed acoustic sensor network (DASN) refers to a sound acquisition system that consists of a collection of microphones randomly distributed across a wide acoustic area. Theory and methods for DASN are gaining increasing attention as the associated technologies can be used in a broad range of applications to solve challenging problems. However, unlike traditional microphone arrays or centralized systems, properly exploiting the redundancy among different channels in DASN is facing many challenges including but not limited to variations in pre-amplification gains, clocks, sensors' response, and signal-to-interference-plus-noise ratios (SINRs). Selecting appropriate sensors relevant to the task at hand is therefore crucial in DASN. In this work, we propose a speaker-dependent smoothed frame-level SINR estimation method for sensor selection in multi-speaker scenarios, specifically addressing source movement within DASN. Additionally, we devise an approach for similarity measurement to generate dynamic speaker embeddings resilient to variations in reference speech levels. Furthermore, we introduce a novel loss function that integrates classification and ordinal regression within a unified framework. Extensive simulations are performed and the results demonstrate the efficacy of the proposed method in accurately estimating smoothed frame-level SINR dynamically, yielding state-of-the-art performance.
KW - Distributed acoustic sensor network (DASN)
KW - sensor/node selection
KW - SINR estimation
UR - http://www.scopus.com/inward/record.url?scp=85207139168&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2024.3477277
DO - 10.1109/TASLP.2024.3477277
M3 - 文章
AN - SCOPUS:85207139168
SN - 2329-9290
VL - 32
SP - 4554
EP - 4568
JO - IEEE/ACM Transactions on Audio Speech and Language Processing
JF - IEEE/ACM Transactions on Audio Speech and Language Processing
ER -