TY - GEN
T1 - An Attention-based Neural Network Approach for Single Channel Speech Enhancement
AU - Hao, Xiang
AU - Shan, Changhao
AU - Xu, Yong
AU - Sun, Sining
AU - Xie, Lei
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
AB - This paper proposes an attention-based neural network approach for single channel speech enhancement. Our work is inspired by the recent success of attention models in sequence-to-sequence learning. Using an attention mechanism for speech enhancement is intuitive: humans focus on the important speech components in an audio stream with high attention while perceiving unimportant regions (e.g., noise or interference) with low attention, adjusting the focal point over time. Specifically, taking the noisy spectrum as input, our model is composed of an LSTM-based encoder, an attention mechanism, and a speech generator, and it outputs the enhanced spectrum. Experiments show that, compared with OM-LSA and an LSTM baseline, the proposed attention approach consistently achieves better performance in terms of speech quality (PESQ) and intelligibility (STOI). More promisingly, the attention-based approach generalizes better to unseen noise conditions.
KW - attention mechanism
KW - neural networks
KW - speech enhancement
UR - http://www.scopus.com/inward/record.url?scp=85068999516&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8683169
DO - 10.1109/ICASSP.2019.8683169
M3 - Conference contribution
AN - SCOPUS:85068999516
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 6895
EP - 6899
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Y2 - 12 May 2019 through 17 May 2019
ER -