Semi-blind dual-microphone noise reduction with known target localization

Jian Zhang; Zhonghua Fu; Lei Xie; Yali Zhao

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Noise reduction is essential for practical speech recognition systems. In many applications, the target speaker location is fixed, but the characteristics of the interference, such as the type, number, and locations of the sources, are unknown and may even change over time. This paper presents a semi-blind dual-microphone noise reduction method for such conditions, based on the sparsity of speech in the time-frequency domain. The target speaker location is assumed to be known and fixed, and is used to build a spatial location model of the target. The spatial location model of the unknown noise is then obtained by model adaptation from the target speaker model. Next, every time-frequency bin of the mixed signals is classified to build a binary mask. Finally, the target speech is re-synthesized using the binary mask. Tests show that this approach significantly reduces complicated noise with little speech distortion, and that its performance is close to that of the non-blind degenerate unmixing estimation method.
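The masking pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record gives no details of the spatial models, so the sketch assumes an anechoic pure-delay model between the two microphones and replaces the adapted noise model with a fixed phase tolerance. The function names (`stft`, `istft`, `binary_mask`, `enhance`) and the parameters `target_delay` and `tol` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    win = np.hanning(n_fft + 1)[:-1]  # periodic Hann: overlapping windows sum to 1 at 50% overlap
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(X, n_fft=256, hop=128):
    """Overlap-add resynthesis matching stft() above (no synthesis window)."""
    frames = np.fft.irfft(X, n=n_fft, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + n_fft)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
    return out

def binary_mask(X1, X2, target_delay, fs, n_fft=256, tol=0.5):
    """Mark bins whose inter-microphone phase difference matches the known target delay.

    Assumption (not from the paper): the target reaches microphone 2 as a pure
    delay of `target_delay` seconds relative to microphone 1, and a fixed
    tolerance `tol` (radians) stands in for the adapted noise model.
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    expected = np.exp(-2j * np.pi * freqs * target_delay)  # ideal X2/X1 phase for the target
    observed = X2 * np.conj(X1)                            # measured cross-spectrum per bin
    err = np.abs(np.angle(observed * np.conj(expected)))   # phase mismatch in [0, pi]
    return (err < tol).astype(float)

def enhance(x1, x2, target_delay, fs, n_fft=256, hop=128, tol=0.5):
    """Classify every time-frequency bin, apply the binary mask, and resynthesize."""
    X1, X2 = stft(x1, n_fft, hop), stft(x2, n_fft, hop)
    mask = binary_mask(X1, X2, target_delay, fs, n_fft, tol)
    return istft(mask * X1, n_fft, hop), mask
```

Calling `enhance(x1, x2, target_delay, fs)` on two equal-length microphone signals returns the masked estimate of the target together with the mask itself. The fixed tolerance is the simplest possible classifier; the method summarized above instead derives a noise model by adaptation from the target model, which this sketch does not attempt to reproduce.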

Original language	English
Pages (from-to)	1215-1219+1225
Journal	Qinghua Daxue Xuebao/Journal of Tsinghua University
Volume	51
Issue number	9
Publication status	Published - Sep 2011

Cite this

@article{85c933c9663742bc901f52aa39bc699e,

title = "Semi-blind dual-microphone noise reduction with known target localization",

abstract = "Noise reduction is essential for practical speech recognition systems. In many applications, the target speaker location is fixed, but the interference information such as the type, number and locations are unknown, and may even change over time. This paper presents a semi-blind dual-microphone noise reduction method for these problems which is based on the sparsity of the speech in the time-frequency distribution. The target speaker location is assumed to be known and fixed for building a spatial location model. The spatial location model of the unknown noise is obtained using model adaptation based on the target speaker model. Then, every time-frequency bin of mixed signals is classified to build a binary mask. Finally, the target speech is re-synthesized with the binary mask. Tests show that this approach significantly reduces complicated noise with little speech distortion. The performance is close to that of the un-blind degenerate unmixing estimation method.",

keywords = "Binary mask, Dual microphones, Noise reduction",

author = "Jian Zhang and Zhonghua Fu and Lei Xie and Yali Zhao",

year = "2011",

month = sep,

language = "English",

volume = "51",

pages = "1215--1219+1225",

journal = "Qinghua Daxue Xuebao/Journal of Tsinghua University",

issn = "1000-0054",

publisher = "Tsinghua University Press",

number = "9",

}

TY - JOUR

T1 - Semi-blind dual-microphone noise reduction with known target localization

AU - Zhang, Jian

AU - Fu, Zhonghua

AU - Xie, Lei

AU - Zhao, Yali

PY - 2011/9

Y1 - 2011/9

N2 - Noise reduction is essential for practical speech recognition systems. In many applications, the target speaker location is fixed, but the interference information such as the type, number and locations are unknown, and may even change over time. This paper presents a semi-blind dual-microphone noise reduction method for these problems which is based on the sparsity of the speech in the time-frequency distribution. The target speaker location is assumed to be known and fixed for building a spatial location model. The spatial location model of the unknown noise is obtained using model adaptation based on the target speaker model. Then, every time-frequency bin of mixed signals is classified to build a binary mask. Finally, the target speech is re-synthesized with the binary mask. Tests show that this approach significantly reduces complicated noise with little speech distortion. The performance is close to that of the un-blind degenerate unmixing estimation method.

AB - Noise reduction is essential for practical speech recognition systems. In many applications, the target speaker location is fixed, but the interference information such as the type, number and locations are unknown, and may even change over time. This paper presents a semi-blind dual-microphone noise reduction method for these problems which is based on the sparsity of the speech in the time-frequency distribution. The target speaker location is assumed to be known and fixed for building a spatial location model. The spatial location model of the unknown noise is obtained using model adaptation based on the target speaker model. Then, every time-frequency bin of mixed signals is classified to build a binary mask. Finally, the target speech is re-synthesized with the binary mask. Tests show that this approach significantly reduces complicated noise with little speech distortion. The performance is close to that of the un-blind degenerate unmixing estimation method.

KW - Binary mask

KW - Dual microphones

KW - Noise reduction

UR - http://www.scopus.com/inward/record.url?scp=80355140472&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:80355140472

SN - 1000-0054

VL - 51

SP - 1215-1219+1225

JO - Qinghua Daxue Xuebao/Journal of Tsinghua University

JF - Qinghua Daxue Xuebao/Journal of Tsinghua University

IS - 9

ER -
