The NPU-Elevoc Personalized Speech Enhancement System for ICASSP2023 DNS Challenge

Xiaopeng Yan; Yindi Yang; Zhihao Guo; Liangliang Peng; Lei Xie

doi:10.1109/ICASSP49357.2023.10096362

School of Computer Science

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations

Abstract

This paper describes our NPU-Elevoc personalized speech enhancement system (NAPSE) for the 5th Deep Noise Suppression Challenge [1] at ICASSP 2023. Based on the superior two-stage model TEA-PSE 2.0 [2], our system explores a better strategy for speaker embedding fusion, optimizes the model training pipeline, and leverages adversarial training and a multi-scale loss. According to the results, our system tied for 1st place in the headset track (track 1) and ranked 2nd in the speakerphone track (track 2).

Original language	English
Journal	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
DOIs	https://doi.org/10.1109/ICASSP49357.2023.10096362
State	Published - 2023
Event	48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration	4 Jun 2023 → 10 Jun 2023

Keywords

deep learning
generative adversarial network
personalized speech enhancement
real-time

Access to Document

10.1109/ICASSP49357.2023.10096362

Cite this

@article{9e29405b42c440258c5a3a4c97165135,

title = "The NPU-Elevoc Personalized Speech Enhancement System for ICASSP2023 DNS Challenge",

abstract = "This paper describes our NPU-Elevoc personalized speech enhancement system (NAPSE) for the 5th Deep Noise Suppression Challenge [1] at ICASSP 2023. Based on the superior two-stage model TEA-PSE 2.0 [2], our system explores a better strategy for speaker embedding fusion, optimizes the model training pipeline, and leverages adversarial training and a multi-scale loss. According to the results, our system tied for 1st place in the headset track (track 1) and ranked 2nd in the speakerphone track (track 2).",

keywords = "deep learning, generative adversarial network, personalized speech enhancement, real-time",

author = "Xiaopeng Yan and Yindi Yang and Zhihao Guo and Liangliang Peng and Lei Xie",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 ; Conference date: 04-06-2023 Through 10-06-2023",

year = "2023",

doi = "10.1109/ICASSP49357.2023.10096362",

language = "English",

journal = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

issn = "1520-6149",

}

TY - JOUR

T1 - The NPU-Elevoc Personalized Speech Enhancement System for ICASSP2023 DNS Challenge

AU - Yan, Xiaopeng

AU - Yang, Yindi

AU - Guo, Zhihao

AU - Peng, Liangliang

AU - Xie, Lei

PY - 2023

Y1 - 2023

N2 - This paper describes our NPU-Elevoc personalized speech enhancement system (NAPSE) for the 5th Deep Noise Suppression Challenge [1] at ICASSP 2023. Based on the superior two-stage model TEA-PSE 2.0 [2], our system explores a better strategy for speaker embedding fusion, optimizes the model training pipeline, and leverages adversarial training and a multi-scale loss. According to the results, our system tied for 1st place in the headset track (track 1) and ranked 2nd in the speakerphone track (track 2).

AB - This paper describes our NPU-Elevoc personalized speech enhancement system (NAPSE) for the 5th Deep Noise Suppression Challenge [1] at ICASSP 2023. Based on the superior two-stage model TEA-PSE 2.0 [2], our system explores a better strategy for speaker embedding fusion, optimizes the model training pipeline, and leverages adversarial training and a multi-scale loss. According to the results, our system tied for 1st place in the headset track (track 1) and ranked 2nd in the speakerphone track (track 2).

KW - deep learning

KW - generative adversarial network

KW - personalized speech enhancement

KW - real-time

UR - http://www.scopus.com/inward/record.url?scp=85174794590&partnerID=8YFLogxK

U2 - 10.1109/ICASSP49357.2023.10096362

DO - 10.1109/ICASSP49357.2023.10096362

M3 - Conference article

AN - SCOPUS:85174794590

SN - 1520-6149

JO - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

JF - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

Y2 - 4 June 2023 through 10 June 2023

ER -
