Model-based voice activity detection in wireless acoustic sensor networks

Yingke Zhao; Jesper Kjær Nielsen; Mads Græsbøll Christensen; Jingdong Chen

doi:10.23919/EUSIPCO.2018.8553457

School of Marine Science and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

One of the major challenges in wireless acoustic sensor networks (WASN) based speech enhancement is robust and accurate voice activity detection (VAD). VAD is widely used in speech enhancement, speech coding, speech recognition, etc. In speech enhancement applications, VAD plays an important role, since noise statistics can be updated during non-speech frames to ensure efficient noise reduction and tolerable speech distortion. Although significant efforts have been made in single channel VAD, few solutions can be found in the multichannel case, especially in WASN. In this paper, we introduce a distributed VAD by using model-based noise power spectral density (PSD) estimation. For each node in the network, the speech PSD and noise PSD are first estimated, then a distributed detection is made by applying the generalized likelihood ratio test (GLRT). The proposed global GLRT based VAD has a quite general form. Indeed, we can judge whether the speech is present or absent by using the current time frame and frequency band observation or by taking into account the neighbouring frames and bands. Finally, the distributed GLRT result is obtained by using a distributed consensus method, such as random gossip, i.e., the whole detection system does not need any fusion center. With the model-based noise estimation method, the proposed distributed VAD performs robustly under non-stationary noise conditions, such as babble noise. As shown in experiments, the proposed method outperforms traditional multichannel VAD methods in terms of detection accuracy.
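The fusion-center-free consensus step mentioned in the abstract can be illustrated with a minimal sketch of pairwise randomized gossip averaging. This is not the authors' implementation: the per-node statistics, the ring topology, the iteration count, and the threshold below are all hypothetical values chosen for illustration, and the abstract does not specify how the local GLRT statistics are computed or combined.

```python
import random


def random_gossip(values, edges, num_iters=2000, seed=0):
    """Pairwise randomized gossip averaging.

    values: one local scalar per node (hypothetical per-node detection statistics).
    edges:  list of (i, j) index pairs naming nodes that can talk to each other.
    Returns the per-node values after num_iters pairwise exchanges.
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(num_iters):
        # Pick one communication link at random; its two endpoints
        # replace their values with the pairwise average.
        i, j = rng.choice(edges)
        avg = 0.5 * (x[i] + x[j])
        x[i] = avg
        x[j] = avg
    return x


# Hypothetical local statistics at four nodes connected in a ring.
local_stats = [2.1, -0.4, 1.3, 0.8]
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]

consensus = random_gossip(local_stats, ring)
global_mean = sum(local_stats) / len(local_stats)

# Each node thresholds its own copy of the network-wide average
# (the threshold value is an assumption for illustration only).
threshold = 0.5
decisions = [c > threshold for c in consensus]
```

Because each exchange preserves the sum of the two values involved, every node's value approaches the network-wide mean on a connected graph, so all nodes can reach the same decision using only neighbour-to-neighbour communication.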

Original language	English
Title of host publication	2018 26th European Signal Processing Conference, EUSIPCO 2018
Publisher	European Signal Processing Conference, EUSIPCO
Pages	425-429
Number of pages	5
ISBN (Electronic)	9789082797015
DOI	https://doi.org/10.23919/EUSIPCO.2018.8553457
Publication status	Published - 29 Nov 2018
Event	26th European Signal Processing Conference, EUSIPCO 2018 - Rome, Italy; Duration: 3 Sep 2018 → 7 Sep 2018

Publication series

Name	European Signal Processing Conference
Volume	2018-September
ISSN (Print)	2219-5491

Conference

Conference	26th European Signal Processing Conference, EUSIPCO 2018
Country/Territory	Italy
City	Rome
Period	3/09/18 → 7/09/18

Cite this

Zhao, Y., Nielsen, J. K., Christensen, M. G., & Chen, J. (2018). Model-based voice activity detection in wireless acoustic sensor networks. In 2018 26th European Signal Processing Conference, EUSIPCO 2018 (pp. 425-429). Article 8553457 (European Signal Processing Conference; Vol. 2018-September). European Signal Processing Conference, EUSIPCO. https://doi.org/10.23919/EUSIPCO.2018.8553457

@inproceedings{5cea234a92f64f2e8c3cb125b9c33249,

title = "Model-based voice activity detection in wireless acoustic sensor networks",

abstract = "One of the major challenges in wireless acoustic sensor networks (WASN) based speech enhancement is robust and accurate voice activity detection (VAD). VAD is widely used in speech enhancement, speech coding, speech recognition, etc. In speech enhancement applications, VAD plays an important role, since noise statistics can be updated during non-speech frames to ensure efficient noise reduction and tolerable speech distortion. Although significant efforts have been made in single channel VAD, few solutions can be found in the multichannel case, especially in WASN. In this paper, we introduce a distributed VAD by using model-based noise power spectral density (PSD) estimation. For each node in the network, the speech PSD and noise PSD are first estimated, then a distributed detection is made by applying the generalized likelihood ratio test (GLRT). The proposed global GLRT based VAD has a quite general form. Indeed, we can judge whether the speech is present or absent by using the current time frame and frequency band observation or by taking into account the neighbouring frames and bands. Finally, the distributed GLRT result is obtained by using a distributed consensus method, such as random gossip, i.e., the whole detection system does not need any fusion center. With the model-based noise estimation method, the proposed distributed VAD performs robustly under non-stationary noise conditions, such as babble noise. As shown in experiments, the proposed method outperforms traditional multichannel VAD methods in terms of detection accuracy.",

keywords = "Distributed voice activity detection, Noise PSD estimation, Wireless acoustic sensor networks",

author = "Yingke Zhao and Nielsen, {Jesper Kj{\ae}r} and Christensen, {Mads Gr{\ae}sb{\o}ll} and Jingdong Chen",

note = "Publisher Copyright: {\textcopyright} EURASIP 2018.; 26th European Signal Processing Conference, EUSIPCO 2018 ; Conference date: 03-09-2018 Through 07-09-2018",

year = "2018",

month = nov,

day = "29",

doi = "10.23919/EUSIPCO.2018.8553457",

language = "English",

series = "European Signal Processing Conference",

publisher = "European Signal Processing Conference, EUSIPCO",

pages = "425--429",

booktitle = "2018 26th European Signal Processing Conference, EUSIPCO 2018",

}

Zhao, Y, Nielsen, JK, Christensen, MG & Chen, J 2018, Model-based voice activity detection in wireless acoustic sensor networks. in 2018 26th European Signal Processing Conference, EUSIPCO 2018., 8553457, European Signal Processing Conference, vol. 2018-September, European Signal Processing Conference, EUSIPCO, pp. 425-429, 26th European Signal Processing Conference, EUSIPCO 2018, Rome, Italy, 3/09/18. https://doi.org/10.23919/EUSIPCO.2018.8553457

Model-based voice activity detection in wireless acoustic sensor networks. / Zhao, Yingke; Nielsen, Jesper Kjær; Christensen, Mads Græsbøll et al.
2018 26th European Signal Processing Conference, EUSIPCO 2018. European Signal Processing Conference, EUSIPCO, 2018. p. 425-429 8553457 (European Signal Processing Conference; Vol. 2018-September).

TY - GEN

T1 - Model-based voice activity detection in wireless acoustic sensor networks

AU - Zhao, Yingke

AU - Nielsen, Jesper Kjær

AU - Christensen, Mads Græsbøll

AU - Chen, Jingdong

PY - 2018/11/29

Y1 - 2018/11/29

AB - One of the major challenges in wireless acoustic sensor networks (WASN) based speech enhancement is robust and accurate voice activity detection (VAD). VAD is widely used in speech enhancement, speech coding, speech recognition, etc. In speech enhancement applications, VAD plays an important role, since noise statistics can be updated during non-speech frames to ensure efficient noise reduction and tolerable speech distortion. Although significant efforts have been made in single channel VAD, few solutions can be found in the multichannel case, especially in WASN. In this paper, we introduce a distributed VAD by using model-based noise power spectral density (PSD) estimation. For each node in the network, the speech PSD and noise PSD are first estimated, then a distributed detection is made by applying the generalized likelihood ratio test (GLRT). The proposed global GLRT based VAD has a quite general form. Indeed, we can judge whether the speech is present or absent by using the current time frame and frequency band observation or by taking into account the neighbouring frames and bands. Finally, the distributed GLRT result is obtained by using a distributed consensus method, such as random gossip, i.e., the whole detection system does not need any fusion center. With the model-based noise estimation method, the proposed distributed VAD performs robustly under non-stationary noise conditions, such as babble noise. As shown in experiments, the proposed method outperforms traditional multichannel VAD methods in terms of detection accuracy.

KW - Distributed voice activity detection

KW - Noise PSD estimation

KW - Wireless acoustic sensor networks

UR - http://www.scopus.com/inward/record.url?scp=85059811358&partnerID=8YFLogxK

U2 - 10.23919/EUSIPCO.2018.8553457

DO - 10.23919/EUSIPCO.2018.8553457

M3 - Conference contribution

AN - SCOPUS:85059811358

T3 - European Signal Processing Conference

SP - 425

EP - 429

BT - 2018 26th European Signal Processing Conference, EUSIPCO 2018

PB - European Signal Processing Conference, EUSIPCO

T2 - 26th European Signal Processing Conference, EUSIPCO 2018

Y2 - 3 September 2018 through 7 September 2018

ER -
