
A two-stage multi-feature integration approach to unsupervised speaker change detection in real-time news broadcasting

  • Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

This paper presents a two-stage multi-feature integration approach for unsupervised speaker change detection in real-time news broadcasting. In the first stage, metric-based potential speaker change detection, we integrate MFCC and LSP features (i.e., a perceptual feature plus an articulatory feature) to collect as many speaker boundary candidates as possible. In the second stage, speaker boundary confirmation, we adopt a weighted Bayesian information criterion (BIC) to integrate the boundary decisions from the MFCC and LSP features. This multi-feature integration strategy exploits the complementarity between perceptual and articulatory features to achieve a performance gain. Speaker change detection experiments show that the multi-feature integration approach significantly outperforms the individual features, with relative improvements of 26% over the LSP-only approach and 6% over the MFCC-only approach.
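The boundary confirmation stage can be illustrated with a minimal sketch of the standard single-Gaussian ΔBIC test applied to two feature streams. The abstract does not give the paper's weighting scheme, so the convex combination below, the weight `w_mfcc`, the penalty factor `penalty_weight`, and the function names are all assumptions for illustration only. Feature extraction is not shown: `mfcc` and `lsp` stand for frame-aligned arrays of shape (frames, dimensions).

```python
import numpy as np


def delta_bic(frames, split, penalty_weight=1.0):
    """Standard full-covariance Gaussian delta-BIC for a candidate boundary.

    frames: array of shape (n_frames, n_dims); split: candidate frame index.
    A positive value favours modelling the two sides separately, i.e. a
    speaker change at `split`.
    """
    n, d = frames.shape
    left, right = frames[:split], frames[split:]

    def logdet_cov(x):
        # Maximum-likelihood covariance, lightly regularised for stability.
        cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    gain = 0.5 * (n * logdet_cov(frames)
                  - len(left) * logdet_cov(left)
                  - len(right) * logdet_cov(right))
    # Penalty for the extra mean vector and covariance matrix.
    penalty = 0.5 * penalty_weight * (d + 0.5 * d * (d + 1)) * np.log(n)
    return gain - penalty


def weighted_delta_bic(mfcc, lsp, split, w_mfcc=0.5):
    """Assumed integration rule: convex combination of per-feature delta-BIC.

    This is a hypothetical stand-in, not the weighting used in the paper.
    """
    if len(mfcc) != len(lsp):
        raise ValueError("feature streams must be frame-aligned")
    return (w_mfcc * delta_bic(mfcc, split)
            + (1.0 - w_mfcc) * delta_bic(lsp, split))
```

Under this sketch, a candidate boundary from the first stage would be kept when `weighted_delta_bic(...)` is positive and discarded otherwise. In practice both `w_mfcc` and `penalty_weight` would need tuning on held-out data.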

Original language: English
Title of host publication: Proceedings - 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008
Pages: 350-353
Number of pages: 4
DOIs
State: Published - 2008
Event: 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008 - Kunming, China
Duration: 16 Dec 2008 to 19 Dec 2008

Publication series

Name: Proceedings - 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008

Conference

Conference: 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008
Country/Territory: China
City: Kunming
Period: 16/12/08 to 19/12/08

Keywords

  • Audio content analysis
  • Audio segmentation
  • Speaker change detection
  • Speaker segmentation
