Speech enhancement in the Karhunen-Loève expansion domain

Jacob Benesty; Jingdong Chen; Yiteng Huang

doi:10.2200/S00326ED1V01Y201101SAP007

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

6 Scopus citations

Abstract

This book is devoted to the study of the problem of speech enhancement whose objective is the recovery of a signal of interest (i.e., speech) from noisy observations. Typically, the recovery process is accomplished by passing the noisy observations through a linear filter (or a linear transformation). Since both the desired speech and undesired noise are filtered at the same time, the most critical issue of speech enhancement resides in how to design a proper optimal filter that can fully take advantage of the difference between the speech and noise statistics to mitigate the noise effect as much as possible while maintaining the speech perception identical to its original form. The optimal filters can be designed either in the time domain or in a transform space. As the title indicates, this book will focus on developing and analyzing optimal filters in the Karhunen-Loève expansion (KLE) domain. We begin by describing the basic problem of speech enhancement and the fundamental principles to solve it in the time domain. We then explain how the problem can be equivalently formulated in the KLE domain. Next, we divide the general problem in the KLE domain into four groups, depending on whether interframe and interband information is accounted for, leading to four linear models for speech enhancement in the KLE domain. For each model, we introduce signal processing measures to quantify the performance of speech enhancement, discuss the formation of different cost functions, and address the optimization of these cost functions for the derivation of different optimal filters. Both theoretical analysis and experiments will be provided to study the performance of these filters and the links between the KLE-domain and time-domain optimal filters will be examined.

Original language	English
Title of host publication	Synthesis Lectures on Speech and Audio Processing
Publisher	Morgan and Claypool Publishers
Pages	1-112
Number of pages	112
ISBN (Print)	9781608456055
DOIs	https://doi.org/10.2200/S00326ED1V01Y201101SAP007
State	Published - 5 Jan 2011
Externally published	Yes

Publication series

Name	Synthesis Lectures on Speech and Audio Processing
Volume	7
ISSN (Print)	1932-121X
ISSN (Electronic)	1932-1678

Keywords

Karhunen-Loève expansion (KLE)
KLE domain
maximum signal-to-noise ratio (SNR) filter
minimum variance distortionless response (MVDR) filter
noise reduction
single-channel microphone signal processing
speech enhancement
time domain
tradeoff filter
Wiener filter

Cite this

@inbook{2882cbe4c75448418dc7118e88a19f28,

title = "Speech enhancement in the Karhunen-Lo{\`e}ve expansion domain",

abstract = "This book is devoted to the study of the problem of speech enhancement whose objective is the recovery of a signal of interest (i.e., speech) from noisy observations. Typically, the recovery process is accomplished by passing the noisy observations through a linear filter (or a linear transformation). Since both the desired speech and undesired noise are filtered at the same time, the most critical issue of speech enhancement resides in how to design a proper optimal filter that can fully take advantage of the difference between the speech and noise statistics to mitigate the noise effect as much as possible while maintaining the speech perception identical to its original form. The optimal filters can be designed either in the time domain or in a transform space. As the title indicates, this book will focus on developing and analyzing optimal filters in the Karhunen-Lo{\`e}ve expansion (KLE) domain. We begin by describing the basic problem of speech enhancement and the fundamental principles to solve it in the time domain. We then explain how the problem can be equivalently formulated in the KLE domain. Next, we divide the general problem in the KLE domain into four groups, depending on whether interframe and interband information is accounted for, leading to four linear models for speech enhancement in the KLE domain. For each model, we introduce signal processing measures to quantify the performance of speech enhancement, discuss the formation of different cost functions, and address the optimization of these cost functions for the derivation of different optimal filters. Both theoretical analysis and experiments will be provided to study the performance of these filters and the links between the KLE-domain and time-domain optimal filters will be examined.",

keywords = "Karhunen-Lo{\`e}ve expansion (KLE), KLE domain, maximum signal-to-noise ratio (SNR) filter, minimum variance distortionless response (MVDR) filter, noise reduction, single-channel microphone signal processing, speech enhancement, time domain, tradeoff filter, Wiener filter",

author = "Jacob Benesty and Jingdong Chen and Yiteng Huang",

year = "2011",

month = jan,

day = "5",

doi = "10.2200/S00326ED1V01Y201101SAP007",

language = "English",

isbn = "9781608456055",

series = "Synthesis Lectures on Speech and Audio Processing",

publisher = "Morgan and Claypool Publishers",

pages = "1--112",

booktitle = "Synthesis Lectures on Speech and Audio Processing",

}

TY - CHAP

T1 - Speech enhancement in the Karhunen-Loève expansion domain

AU - Benesty, Jacob

AU - Chen, Jingdong

AU - Huang, Yiteng

PY - 2011/1/5

Y1 - 2011/1/5

N2 - This book is devoted to the study of the problem of speech enhancement whose objective is the recovery of a signal of interest (i.e., speech) from noisy observations. Typically, the recovery process is accomplished by passing the noisy observations through a linear filter (or a linear transformation). Since both the desired speech and undesired noise are filtered at the same time, the most critical issue of speech enhancement resides in how to design a proper optimal filter that can fully take advantage of the difference between the speech and noise statistics to mitigate the noise effect as much as possible while maintaining the speech perception identical to its original form. The optimal filters can be designed either in the time domain or in a transform space. As the title indicates, this book will focus on developing and analyzing optimal filters in the Karhunen-Loève expansion (KLE) domain. We begin by describing the basic problem of speech enhancement and the fundamental principles to solve it in the time domain. We then explain how the problem can be equivalently formulated in the KLE domain. Next, we divide the general problem in the KLE domain into four groups, depending on whether interframe and interband information is accounted for, leading to four linear models for speech enhancement in the KLE domain. For each model, we introduce signal processing measures to quantify the performance of speech enhancement, discuss the formation of different cost functions, and address the optimization of these cost functions for the derivation of different optimal filters. Both theoretical analysis and experiments will be provided to study the performance of these filters and the links between the KLE-domain and time-domain optimal filters will be examined.

KW - Karhunen-Loève expansion (KLE)

KW - KLE domain

KW - maximum signal-to-noise ratio (SNR) filter

KW - minimum variance distortionless response (MVDR) filter

KW - noise reduction

KW - single-channel microphone signal processing

KW - speech enhancement

KW - time domain

KW - tradeoff filter

KW - Wiener filter

UR - http://www.scopus.com/inward/record.url?scp=79551475838&partnerID=8YFLogxK

U2 - 10.2200/S00326ED1V01Y201101SAP007

DO - 10.2200/S00326ED1V01Y201101SAP007

M3 - Chapter

AN - SCOPUS:79551475838

SN - 9781608456055

T3 - Synthesis Lectures on Speech and Audio Processing

SP - 1

EP - 112

BT - Synthesis Lectures on Speech and Audio Processing

PB - Morgan and Claypool Publishers

ER -
