Abstract
Noise reduction is essential for practical speech recognition systems. In many applications, the target speaker location is fixed, but the type, number, and locations of the interfering sources are unknown and may change over time. This paper presents a semi-blind dual-microphone noise reduction method for such conditions, based on the sparsity of speech in the time-frequency representation. The target speaker location is assumed to be known and fixed, and is used to build a spatial location model of the target. The spatial location model of the unknown noise is then obtained by model adaptation from the target speaker model. Each time-frequency bin of the mixed signals is classified to build a binary mask, and the target speech is re-synthesized using that mask. Tests show that this approach significantly reduces complicated noise with little speech distortion, with performance close to that of the non-blind degenerate unmixing estimation method.
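
The classification and masking steps described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the adapted noise model is replaced here by a fixed tolerance around the known target delay, in the spirit of delay-based degenerate unmixing estimation. The function names, the `tolerance` parameter, and the use of inter-microphone delay as the only spatial cue are all assumptions made for illustration. The inputs are taken to be short-time Fourier transforms of the two microphone signals, given as lists of frames of complex bins.

```python
import cmath
import math


def estimate_delay(x1, x2, omega):
    """Relative delay (in samples) of mic 2 against mic 1 for one bin.

    Only meaningful while |omega * delay| < pi (no phase wrapping).
    """
    if x1 == 0 or omega == 0:
        return None  # delay is undefined for empty bins and for the DC bin
    return -cmath.phase(x2 / x1) / omega


def binary_mask(stft1, stft2, n_fft, target_delay, tolerance=0.5):
    """Return mask[t][k] = 1 where the bin's delay matches the target.

    Assumption: a fixed tolerance stands in for the adapted noise model.
    """
    mask = []
    for frame1, frame2 in zip(stft1, stft2):
        row = []
        for k, (x1, x2) in enumerate(zip(frame1, frame2)):
            omega = 2.0 * math.pi * k / n_fft  # radians per sample
            delay = estimate_delay(x1, x2, omega)
            keep = delay is not None and abs(delay - target_delay) <= tolerance
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(stft, mask):
    """Zero every time-frequency bin that was not assigned to the target."""
    return [[x * m for x, m in zip(frame, row)]
            for frame, row in zip(stft, mask)]
```

The masked spectrogram returned by `apply_mask` would then be passed to an inverse short-time Fourier transform to re-synthesize the target speech. That step is omitted here.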
Original language | English |
---|---|
Pages (from-to) | 1215-1219+1225 |
Journal | Qinghua Daxue Xuebao/Journal of Tsinghua University |
Volume | 51 |
Issue number | 9 |
State | Published - Sep 2011 |
Keywords
- Binary mask
- Dual microphones
- Noise reduction