Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling

Jixun Yao; Qing Wang; Yi Lei; Pengcheng Guo; Lei Xie; Namin Wang; Jie Liu

doi:10.1109/ICASSP49357.2023.10095120

Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling

Jixun Yao, Qing Wang, Yi Lei, Pengcheng Guo, Lei Xie, Namin Wang, Jie Liu

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

Speech data on the Internet are proliferating exponentially because of the emergence of social media, and the sharing of such personal data raises obvious security and privacy concerns. One solution to mitigate these concerns involves concealing speaker identities before sharing speech data, also referred to as speaker anonymization. In our previous work, we have developed an automatic speaker verification (ASV)-model-free anonymization framework to protect speaker privacy while preserving speech intelligibility. Although the framework ranked first place in VoicePrivacy 2022 challenge, the anonymization was imperfect, since the speaker distinguishability of the anonymized speech was deteriorated. To address this issue, in this paper, we directly model the formant distribution and fundamental frequency (F0) to represent speaker identity and anonymize the source speech by the uniformly scaling formant and F0. By directly scaling the formant and F0, the speaker distinguishability degradation of the anonymized speech caused by the introduction of other speakers is prevented. The experimental results demonstrate that our proposed framework can improve the speaker distinguishability and significantly outperforms our previous framework in voice distinctiveness. Furthermore, our proposed method can trade off the privacy-utility by using different scaling factors.

Original language	English
Title of host publication	ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781728163277
DOIs	https://doi.org/10.1109/ICASSP49357.2023.10095120
State	Published - 2023
Event	48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece Duration: 4 Jun 2023 → 10 Jun 2023

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume	2023-June
ISSN (Print)	1520-6149

Conference

Conference	48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Country/Territory	Greece
City	Rhodes Island
Period	4/06/23 → 10/06/23

Keywords

Speaker anonymization
privacy protection
voice privacy challenge

Access to Document

10.1109/ICASSP49357.2023.10095120

Cite this

Yao, J., Wang, Q., Lei, Y., Guo, P., Xie, L., Wang, N., & Liu, J. (2023). Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2023-June). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP49357.2023.10095120

Yao, Jixun ; Wang, Qing ; Lei, Yi et al. / Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings. Institute of Electrical and Electronics Engineers Inc., 2023. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{fffa090860a8473a89633d69d276b9ed,

title = "Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling",

abstract = "Speech data on the Internet are proliferating exponentially because of the emergence of social media, and the sharing of such personal data raises obvious security and privacy concerns. One solution to mitigate these concerns involves concealing speaker identities before sharing speech data, also referred to as speaker anonymization. In our previous work, we have developed an automatic speaker verification (ASV)-model-free anonymization framework to protect speaker privacy while preserving speech intelligibility. Although the framework ranked first place in VoicePrivacy 2022 challenge, the anonymization was imperfect, since the speaker distinguishability of the anonymized speech was deteriorated. To address this issue, in this paper, we directly model the formant distribution and fundamental frequency (F0) to represent speaker identity and anonymize the source speech by the uniformly scaling formant and F0. By directly scaling the formant and F0, the speaker distinguishability degradation of the anonymized speech caused by the introduction of other speakers is prevented. The experimental results demonstrate that our proposed framework can improve the speaker distinguishability and significantly outperforms our previous framework in voice distinctiveness. Furthermore, our proposed method can trade off the privacy-utility by using different scaling factors.",

keywords = "Speaker anonymization, privacy protection, voice privacy challenge",

author = "Jixun Yao and Qing Wang and Yi Lei and Pengcheng Guo and Lei Xie and Namin Wang and Jie Liu",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 ; Conference date: 04-06-2023 Through 10-06-2023",

year = "2023",

doi = "10.1109/ICASSP49357.2023.10095120",

language = "英语",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings",

}

Yao, J, Wang, Q, Lei, Y, Guo, P, Xie, L, Wang, N & Liu, J 2023, Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling. in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2023-June, Institute of Electrical and Electronics Engineers Inc., 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023, Rhodes Island, Greece, 4/06/23. https://doi.org/10.1109/ICASSP49357.2023.10095120

Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling. / Yao, Jixun; Wang, Qing; Lei, Yi et al.
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings. Institute of Electrical and Electronics Engineers Inc., 2023. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2023-June).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

TY - GEN

T1 - Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling

AU - Yao, Jixun

AU - Wang, Qing

AU - Lei, Yi

AU - Guo, Pengcheng

AU - Xie, Lei

AU - Wang, Namin

AU - Liu, Jie

PY - 2023

Y1 - 2023

N2 - Speech data on the Internet are proliferating exponentially because of the emergence of social media, and the sharing of such personal data raises obvious security and privacy concerns. One solution to mitigate these concerns involves concealing speaker identities before sharing speech data, also referred to as speaker anonymization. In our previous work, we have developed an automatic speaker verification (ASV)-model-free anonymization framework to protect speaker privacy while preserving speech intelligibility. Although the framework ranked first place in VoicePrivacy 2022 challenge, the anonymization was imperfect, since the speaker distinguishability of the anonymized speech was deteriorated. To address this issue, in this paper, we directly model the formant distribution and fundamental frequency (F0) to represent speaker identity and anonymize the source speech by the uniformly scaling formant and F0. By directly scaling the formant and F0, the speaker distinguishability degradation of the anonymized speech caused by the introduction of other speakers is prevented. The experimental results demonstrate that our proposed framework can improve the speaker distinguishability and significantly outperforms our previous framework in voice distinctiveness. Furthermore, our proposed method can trade off the privacy-utility by using different scaling factors.

AB - Speech data on the Internet are proliferating exponentially because of the emergence of social media, and the sharing of such personal data raises obvious security and privacy concerns. One solution to mitigate these concerns involves concealing speaker identities before sharing speech data, also referred to as speaker anonymization. In our previous work, we have developed an automatic speaker verification (ASV)-model-free anonymization framework to protect speaker privacy while preserving speech intelligibility. Although the framework ranked first place in VoicePrivacy 2022 challenge, the anonymization was imperfect, since the speaker distinguishability of the anonymized speech was deteriorated. To address this issue, in this paper, we directly model the formant distribution and fundamental frequency (F0) to represent speaker identity and anonymize the source speech by the uniformly scaling formant and F0. By directly scaling the formant and F0, the speaker distinguishability degradation of the anonymized speech caused by the introduction of other speakers is prevented. The experimental results demonstrate that our proposed framework can improve the speaker distinguishability and significantly outperforms our previous framework in voice distinctiveness. Furthermore, our proposed method can trade off the privacy-utility by using different scaling factors.

KW - Speaker anonymization

KW - privacy protection

KW - voice privacy challenge

UR - http://www.scopus.com/inward/record.url?scp=85173658915&partnerID=8YFLogxK

U2 - 10.1109/ICASSP49357.2023.10095120

DO - 10.1109/ICASSP49357.2023.10095120

M3 - 会议稿件

AN - SCOPUS:85173658915

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

BT - ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

Y2 - 4 June 2023 through 10 June 2023

ER -

Yao J, Wang Q, Lei Y, Guo P, Xie L, Wang N et al. Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings. Institute of Electrical and Electronics Engineers Inc. 2023. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP49357.2023.10095120

Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling

Abstract

Publication series

Conference

Keywords

Access to Document

Other files and links

Fingerprint

Cite this