A new speech feature insensitive to the variation of different speakers

Jingdong Chen; Bo Xu; Taiyi Huang

CAS - Institute of Automation

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes a novel robust speech feature based on the modified Mellin transform. Because the modified Mellin transform is scale invariant, the new feature is insensitive to variation in vocal tract length among individual speakers, and is thus more appropriate for speaker-independent speech recognition than the widely used mel-scale frequency cepstral coefficients (MFCC). Experiments show that, compared with MFCC, the new feature not only improves the performance of a speaker-independent speech recognizer but also greatly reduces the standard deviation of the error rates across different outlier speakers.
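The scale-invariance argument in the abstract can be illustrated numerically. The sketch below is not the paper's modified Mellin transform, whose exact definition is not given on this page. It only assumes two textbook facts: on a logarithmic frequency axis, a scaling f -> alpha * f becomes a shift, and the magnitude of a Fourier transform is unchanged by a shift. The toy single-formant envelope, the grid, and all function names are hypothetical.

```python
import numpy as np

def log_axis_magnitude(envelope_on_log_grid):
    """Magnitude of the DFT taken along a log-frequency axis.

    For a purely imaginary Mellin variable, the Mellin transform is a
    Fourier transform in u = ln(f), so this is a discrete stand-in for
    its magnitude (an illustration, not the paper's feature).
    """
    return np.abs(np.fft.rfft(envelope_on_log_grid))

def toy_envelope(freq_hz, formant_hz):
    """Hypothetical single-formant spectral envelope, Gaussian in ln(f)."""
    return np.exp(-0.5 * ((np.log(freq_hz) - np.log(formant_hz)) / 0.1) ** 2)

# Log-spaced frequency grid: equal steps in u = ln(f).
n_points = 512
log_f = np.linspace(np.log(50.0), np.log(8000.0), n_points)
freq_hz = np.exp(log_f)
step = log_f[1] - log_f[0]

# A shorter vocal tract scales formant frequencies by alpha > 1.
# alpha is chosen as a whole number of grid steps so the shift is exact.
alpha = np.exp(20 * step)

speaker_a = toy_envelope(freq_hz, formant_hz=500.0)
speaker_b = toy_envelope(freq_hz, formant_hz=500.0 * alpha)

feat_a = log_axis_magnitude(speaker_a)
feat_b = log_axis_magnitude(speaker_b)

# The raw envelopes differ, but the log-axis magnitudes agree.
max_raw_diff = float(np.max(np.abs(speaker_a - speaker_b)))
max_feat_diff = float(np.max(np.abs(feat_a - feat_b)))
print(f"raw envelope difference:   {max_raw_diff:.3f}")
print(f"log-axis feature difference: {max_feat_diff:.2e}")
```

In this toy setting the two envelopes differ substantially near the formant peak, while the two magnitude vectors agree to within floating-point error. Real speech spectra are only approximately scaled copies of one another, so a practical feature would be expected to reduce, rather than eliminate, speaker-dependent variation.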

Original language	English
Pages (from-to)	70-72
Number of pages	3
Journal	Chinese Journal of Electronics
Volume	8
Issue number	1
State	Published - 1999
Externally published	Yes

Keywords

Mellin transform
Speech recognition

Cite this

@article{e76e0cc037ef4431a470e304953daae1,

title = "A new speech feature insensitive to the variation of different speakers",

abstract = "This paper proposes a novel robust speech feature based on the modified Mellin transform. Because the modified Mellin transform is scale invariant, the new feature is insensitive to variation in vocal tract length among individual speakers, and is thus more appropriate for speaker-independent speech recognition than the widely used mel-scale frequency cepstral coefficients (MFCC). Experiments show that, compared with MFCC, the new feature not only improves the performance of a speaker-independent speech recognizer but also greatly reduces the standard deviation of the error rates across different outlier speakers.",

keywords = "Mellin transform, Speech recognition",

author = "Jingdong Chen and Bo Xu and Taiyi Huang",

year = "1999",

language = "English",

volume = "8",

pages = "70--72",

journal = "Chinese Journal of Electronics",

issn = "1022-4653",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "1",

}

TY - JOUR

T1 - A new speech feature insensitive to the variation of different speakers

AU - Chen, Jingdong

AU - Xu, Bo

AU - Huang, Taiyi

PY - 1999

Y1 - 1999

N2 - This paper proposes a novel robust speech feature based on the modified Mellin transform. Because the modified Mellin transform is scale invariant, the new feature is insensitive to variation in vocal tract length among individual speakers, and is thus more appropriate for speaker-independent speech recognition than the widely used mel-scale frequency cepstral coefficients (MFCC). Experiments show that, compared with MFCC, the new feature not only improves the performance of a speaker-independent speech recognizer but also greatly reduces the standard deviation of the error rates across different outlier speakers.

KW - Mellin transform

KW - Speech recognition

UR - http://www.scopus.com/inward/record.url?scp=24044439583&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:24044439583

SN - 1022-4653

VL - 8

SP - 70

EP - 72

JO - Chinese Journal of Electronics

JF - Chinese Journal of Electronics

IS - 1

ER -
