Speech Enhancement in the STFT Domain

Jacob Benesty; Jingdong Chen; Emanuël A.P. Habets

doi:10.1007/978-3-642-23250-3

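School of Marine Science and Technology

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

21 citations (Scopus)

Abstract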

This work addresses the speech enhancement (noise reduction) problem in the short-time Fourier transform (STFT) domain. We divide the general problem into five basic categories depending on the number of microphones used and whether the interframe or interband correlation is considered. The first category deals with the single-channel problem where STFT coefficients at different frames and frequency bands are assumed to be independent. In this case, the noise reduction filter in each frequency band is simply a real gain. Since a gain does not improve the signal-to-noise ratio (SNR) for any given subband and frame, the noise reduction is achieved by emphasizing the subbands and frames that are less noisy while attenuating those that are more noisy. The second category also concerns the single-channel problem. The difference is that the interframe correlation is now taken into account and a filter, instead of just a gain, is applied in each subband. The advantage of using the interframe correlation is that we can improve not only the long-time fullband SNR but also the frame-wise subband SNR. The third and fourth categories discuss the problem of multichannel noise reduction in the STFT domain with and without interframe correlation, respectively. In the last category, we consider the interband correlation in the design of the noise reduction filters. We illustrate the basic principle for the single-channel case as an example; the concept can be generalized to other scenarios. In all categories, we propose different optimization cost functions from which we derive the optimal filters, and we also define the performance measures that help analyze them.
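The first category's claim, that a real gain leaves each subband SNR unchanged yet can still raise the fullband SNR, can be checked with a small numerical sketch. It assumes the per-subband speech and noise variances are known and uses the Wiener gain (one of the filters named in the keywords) as the example gain. The variance values and function names are hypothetical and are not taken from the book.

```python
def wiener_gains(speech_var, noise_var):
    """Real Wiener gain per subband: phi_x / (phi_x + phi_v)."""
    return [px / (px + pv) for px, pv in zip(speech_var, noise_var)]


def subband_snrs(speech_var, noise_var, gains):
    """SNR in each subband after the same real gain scales speech and noise."""
    return [(g * g * px) / (g * g * pv)
            for px, pv, g in zip(speech_var, noise_var, gains)]


def fullband_snr(speech_var, noise_var, gains):
    """Total filtered speech power divided by total filtered noise power."""
    speech = sum(g * g * px for px, g in zip(speech_var, gains))
    noise = sum(g * g * pv for pv, g in zip(noise_var, gains))
    return speech / noise


# Hypothetical variances for three subbands (illustrative values only).
speech_var = [4.0, 1.0, 0.25]
noise_var = [1.0, 1.0, 1.0]

unit_gains = [1.0] * len(speech_var)  # corresponds to no processing
gains = wiener_gains(speech_var, noise_var)

input_fullband = fullband_snr(speech_var, noise_var, unit_gains)
output_fullband = fullband_snr(speech_var, noise_var, gains)
output_subband = subband_snrs(speech_var, noise_var, gains)

print("gains:", gains)
print("subband SNRs after gain:", output_subband)
print("fullband SNR before/after:", input_fullband, output_fullband)
```

In each subband the squared gain cancels in the ratio, so the subband SNR stays at its input value. The fullband SNR changes because the less noisy subbands receive larger weights in the sums.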

Original language	English
Title of host publication	SpringerBriefs in Speech Technology
Publisher	Springer Science and Business Media B.V.
Pages	1-106
Number of pages	106
DOI	https://doi.org/10.1007/978-3-642-23250-3
Publication status	Published - 2012

Publication series

Name	SpringerBriefs in Speech Technology
Volume	Part F4339
ISSN (Print)	2191-737X
ISSN (Electronic)	2191-7388

Cite this

@inbook{b6e7df60b9794720bf25e997b233f6e5,

title = "Speech Enhancement in the STFT Domain",

abstract = "This work addresses the speech enhancement (noise reduction) problem in the short-time Fourier transform (STFT) domain. We divide the general problem into five basic categories depending on the number of microphones used and whether the interframe or interband correlation is considered. The first category deals with the single-channel problem where STFT coefficients at different frames and frequency bands are assumed to be independent. In this case, the noise reduction filter in each frequency band is simply a real gain. Since a gain does not improve the signal-to-noise ratio (SNR) for any given subband and frame, the noise reduction is achieved by emphasizing the subbands and frames that are less noisy while attenuating those that are more noisy. The second category also concerns the single-channel problem. The difference is that the interframe correlation is now taken into account and a filter, instead of just a gain, is applied in each subband. The advantage of using the interframe correlation is that we can improve not only the long-time fullband SNR but also the frame-wise subband SNR. The third and fourth categories discuss the problem of multichannel noise reduction in the STFT domain with and without interframe correlation, respectively. In the last category, we consider the interband correlation in the design of the noise reduction filters. We illustrate the basic principle for the single-channel case as an example; the concept can be generalized to other scenarios. In all categories, we propose different optimization cost functions from which we derive the optimal filters, and we also define the performance measures that help analyze them.",

keywords = "linearly constrained minimum variance (LCMV) filter, maximum signal-to-noise ratio (SNR) filter, microphone arrays, minimum variance distortionless response (MVDR) filter, prediction filter, short-time Fourier transform (STFT) domain, single-channel and multichannel, Speech enhancement, tradeoff filter, Wiener filter",

author = "Jacob Benesty and Jingdong Chen and Habets, {Emanu{\"e}l A.P.}",

note = "Publisher Copyright: The Author(s) 2012.",

year = "2012",

doi = "10.1007/978-3-642-23250-3",

language = "English",

series = "SpringerBriefs in Speech Technology",

publisher = "Springer Science and Business Media B.V.",

pages = "1--106",

booktitle = "SpringerBriefs in Speech Technology",

}

TY - CHAP

T1 - Speech Enhancement in the STFT Domain

AU - Benesty, Jacob

AU - Chen, Jingdong

AU - Habets, Emanuël A.P.

N1 - Publisher Copyright: The Author(s) 2012.

PY - 2012

Y1 - 2012

AB - This work addresses the speech enhancement (noise reduction) problem in the short-time Fourier transform (STFT) domain. We divide the general problem into five basic categories depending on the number of microphones used and whether the interframe or interband correlation is considered. The first category deals with the single-channel problem where STFT coefficients at different frames and frequency bands are assumed to be independent. In this case, the noise reduction filter in each frequency band is simply a real gain. Since a gain does not improve the signal-to-noise ratio (SNR) for any given subband and frame, the noise reduction is achieved by emphasizing the subbands and frames that are less noisy while attenuating those that are more noisy. The second category also concerns the single-channel problem. The difference is that the interframe correlation is now taken into account and a filter, instead of just a gain, is applied in each subband. The advantage of using the interframe correlation is that we can improve not only the long-time fullband SNR but also the frame-wise subband SNR. The third and fourth categories discuss the problem of multichannel noise reduction in the STFT domain with and without interframe correlation, respectively. In the last category, we consider the interband correlation in the design of the noise reduction filters. We illustrate the basic principle for the single-channel case as an example; the concept can be generalized to other scenarios. In all categories, we propose different optimization cost functions from which we derive the optimal filters, and we also define the performance measures that help analyze them.

KW - linearly constrained minimum variance (LCMV) filter

KW - maximum signal-to-noise ratio (SNR) filter

KW - microphone arrays

KW - minimum variance distortionless response (MVDR) filter

KW - prediction filter

KW - short-time Fourier transform (STFT) domain

KW - single-channel and multichannel

KW - Speech enhancement

KW - tradeoff filter

KW - Wiener filter

UR - http://www.scopus.com/inward/record.url?scp=105004724842&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-23250-3

DO - 10.1007/978-3-642-23250-3

M3 - Chapter

AN - SCOPUS:105004724842

T3 - SpringerBriefs in Speech Technology

SP - 1

EP - 106

BT - SpringerBriefs in Speech Technology

PB - Springer Science and Business Media B.V.

ER -
