Abstract
Noise reduction, which aims at estimating a clean speech signal from a noisy observation, has long been an active research area. The standard approach to this problem is to obtain the clean speech estimate by linearly filtering the noisy signal. The core issue, then, becomes how to design an optimal linear filter that can significantly suppress noise without introducing perceptually noticeable speech distortion. Traditionally, optimal noise-reduction filters are formulated in either the time domain or the frequency domain. This paper studies the problem in the Karhunen-Loève expansion domain. We develop two classes of optimal filters. The first class estimates a frame of clean speech by filtering the corresponding frame of the noisy speech. We will show that many existing methods, such as the widely used Wiener filter and the subspace technique, are closely related to this category. The second class achieves noise reduction by filtering not only the current frame, but also a number of consecutive previous frames of the noisy speech. We will discuss how to design the optimal noise-reduction filters in each class and demonstrate, through both theoretical analysis and experiments, the properties of the deduced optimal filters.
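As a rough illustration of the first class of filters (one frame in, one frame out), the sketch below applies a Wiener-style gain to each expansion coefficient of a noisy frame. It is not taken from the paper: the function name `kle_wiener_frame`, the choice of the noisy-signal covariance `R_y` as the source of the basis, and the assumption that `R_y` and the noise covariance `R_v` are already known are all illustrative.

```python
import numpy as np


def kle_wiener_frame(y_frame, R_y, R_v):
    """Denoise one frame with per-coefficient gains in a KLE-style basis.

    Assumptions (illustrative, not from the paper): R_y and R_v are known
    symmetric covariance matrices of the noisy signal and of the noise.
    """
    # Orthonormal basis: eigenvectors of the symmetric noisy covariance.
    eigvals, Q = np.linalg.eigh(R_y)
    # Noise power along each basis vector: q_l^T R_v q_l.
    noise_power = np.einsum("il,ij,jl->l", Q, R_v, Q)
    # Wiener-style gain per coefficient, clipped to the range [0, 1].
    gains = np.clip(1.0 - noise_power / np.maximum(eigvals, 1e-12), 0.0, 1.0)
    # Analysis (project onto the basis), scaling, then synthesis.
    coeffs = Q.T @ y_frame
    return Q @ (gains * coeffs)
```

With `R_v` set to zero the gains are all one and the frame passes through unchanged; as the noise power along a basis vector approaches the corresponding eigenvalue, that coefficient is attenuated toward zero.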
Original language | English |
---|---|
Article number | 4806284 |
Pages (from-to) | 787-802 |
Number of pages | 16 |
Journal | IEEE Transactions on Audio, Speech and Language Processing |
Volume | 17 |
Issue number | 4 |
State | Published - May 2009 |
Externally published | Yes |
Keywords
- Karhunen-Loève expansion (KLE)
- Maximum signal-to-noise ratio (SNR) filter
- Noise reduction
- Pearson correlation coefficient
- Speech enhancement
- Subspace approach
- Wiener filter