RC-MES: A novel speaker modeling technique based on regression class for speaker identification

Zhong Hua Fu; Lei Xie; Rong Chun Zhao

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Speaker modeling is essential to robust speaker recognition, especially when enrolment data is sparse. This paper presents a novel modeling approach, the Multi-EigenSpace modeling technique based on Regression Class (RC-MES), which integrates the common eigenspace technique with the regression class (RC) idea of Maximum Likelihood Linear Regression (MLLR). RC-MES not only addresses the limited prior knowledge available to Gaussian Mixture Models (GMM) but also remedies a shortcoming of the common eigenspace, which confuses speaker differences with phoneme differences. Performing eigenvoice analysis within regression classes provides better discrimination between speakers. Experimental results on speaker identification with 75 male speakers show that, when enrolment data is sparse, RC-MES provides a significant improvement over GMM and requires fewer eigenvoices than the common eigenspace.
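
As a rough illustration of the idea summarized in the abstract, the sketch below builds one eigenspace per regression class by running PCA on speaker-dependent GMM mean vectors. It is not the authors' implementation. The array layout, the function names `class_eigenspaces` and `project`, the component-to-class assignment, and the use of a plain orthogonal projection in place of a maximum-likelihood estimate of the eigenvoice weights are all assumptions made for illustration.

```python
import numpy as np


def class_eigenspaces(means, classes, k):
    """Build one PCA eigenspace per regression class (illustrative sketch).

    means   : (S, M, D) array, GMM mean vectors of S training speakers,
              M Gaussian components, D feature dimensions (assumed layout).
    classes : (M,) integer array, regression class id of each component
              (assumed to be given, e.g. from a clustering step).
    k       : number of eigenvoices kept per class.
    """
    spaces = {}
    for c in np.unique(classes):
        idx = np.flatnonzero(classes == c)
        # Stack the means of this class into one sub-supervector per speaker.
        X = means[:, idx, :].reshape(means.shape[0], -1)
        mu = X.mean(axis=0)
        # Rows of vt are the principal directions of the centered data.
        _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
        spaces[int(c)] = (idx, mu, vt[:k])
    return spaces


def project(spaces, speaker_means):
    """Represent one speaker by per-class eigenvoice weights.

    speaker_means : (M, D) float array of that speaker's GMM means.
    Returns the weights per class and the means reconstructed from them.
    Plain projection is used here as a stand-in for weight estimation.
    """
    recon = np.empty_like(speaker_means)
    weights = {}
    for c, (idx, mu, basis) in spaces.items():
        x = speaker_means[idx].reshape(-1)
        w = basis @ (x - mu)
        weights[c] = w
        recon[idx] = (mu + basis.T @ w).reshape(len(idx), -1)
    return weights, recon
```

Because each class has its own basis, the weights for one class describe only the variation among speakers for the components grouped in that class, which is the separation of speaker and phoneme differences that the abstract points to.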

Original language	English
Title of host publication	2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004
Pages	214-217
Number of pages	4
State	Published - 2004
Event	2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004 - Hong Kong, China
Duration	20 Oct 2004 → 22 Oct 2004

Publication series

Name	2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004

Conference

Conference	2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004
Country/Territory	Hong Kong
City	Hong Kong, China
Period	20/10/04 → 22/10/04

Cite this

@inproceedings{c17904d4ef6b4ea1b637855c97707f6e,

title = "RC-MES: A novel speaker modeling technique based on regression class for speaker identification",

abstract = "Speaker modeling technique is an essential problem to robust speaker recognition, especially when enrolment data is sparse. This paper presents a novel modeling approach named Multi-EigenSpace modeling technique based on Regression Class (RC-MES), which integrates the common eigenspace technique and the regression class (RC) idea of Maximum Likelihood Linear Regression (MLLR). RC-MES not only solves the problem of prior knowledge limitation of Gaussian Mixture Models (GMM) but also remedies the shortcoming of common eigenspace that confuses speaker differences and phoneme differences. The eigenvoice analysis in RC can provide better discrimination ability between different speakers. The experimental results on speaker identification of 75 males show that, when enrolment data is sparse, RC-MES provides significant improvement over GMM, and the number of eigenvoices in RC-MES is fewer than that in common eigenspace.",

author = "Fu, {Zhong Hua} and Lei Xie and Zhao, {Rong Chun}",

year = "2004",

language = "English",

isbn = "0780386884",

series = "2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004",

pages = "214--217",

booktitle = "2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004",

note = "2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004 ; Conference date: 20-10-2004 Through 22-10-2004",

}

Fu, ZH, Xie, L & Zhao, RC 2004, RC-MES: A novel speaker modeling technique based on regression class for speaker identification. in 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004. 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004, pp. 214-217, 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004, Hong Kong, China, Hong Kong, 20/10/04.

RC-MES: A novel speaker modeling technique based on regression class for speaker identification. / Fu, Zhong Hua; Xie, Lei; Zhao, Rong Chun.
2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004. 2004. p. 214-217 (2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004).

TY - GEN

T1 - RC-MES

T2 - 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004

AU - Fu, Zhong Hua

AU - Xie, Lei

AU - Zhao, Rong Chun

PY - 2004

Y1 - 2004

AB - Speaker modeling technique is an essential problem to robust speaker recognition, especially when enrolment data is sparse. This paper presents a novel modeling approach named Multi-EigenSpace modeling technique based on Regression Class (RC-MES), which integrates the common eigenspace technique and the regression class (RC) idea of Maximum Likelihood Linear Regression (MLLR). RC-MES not only solves the problem of prior knowledge limitation of Gaussian Mixture Models (GMM) but also remedies the shortcoming of common eigenspace that confuses speaker differences and phoneme differences. The eigenvoice analysis in RC can provide better discrimination ability between different speakers. The experimental results on speaker identification of 75 males show that, when enrolment data is sparse, RC-MES provides significant improvement over GMM, and the number of eigenvoices in RC-MES is fewer than that in common eigenspace.

UR - http://www.scopus.com/inward/record.url?scp=14544295644&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:14544295644

SN - 0780386884

T3 - 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004

SP - 214

EP - 217

BT - 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004

Y2 - 20 October 2004 through 22 October 2004

ER -
