Topic segmentation on spoken documents using self-validated acoustic cuts

Hongjie Chen; Lei Xie; Wei Feng; Lilei Zheng; Yanning Zhang

doi:10.1007/s00500-014-1383-9

School of Computer Science

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Topic segmentation is a necessary prerequisite for multimedia content analysis and management. The normalized cuts (NCuts) approach has shown superior performance in topic segmentation of spoken documents. However, this method requires the number of topics in a document to be known prior to segmentation, which is impractical for real-world applications given the exponential growth of multimedia data. Moreover, previous lexical approaches to spoken document segmentation, including NCuts, work on text transcripts generated by a large vocabulary continuous speech recognizer (LVCSR). Training such a recognizer requires a large amount of transcribed speech data and language-specific knowledge, and inevitable speech recognition errors and the out-of-vocabulary (OOV) problem degrade segmentation performance. This paper addresses these problems with a self-validated acoustic normalized cuts approach, namely SACuts. First, compared with NCuts, our approach can determine the number of topics in a spoken document automatically without extra computational load. Second, compared with lexical approaches that rely on a high-resource speech recognizer, our approach can achieve comparable or even better segmentation performance using only acoustic-level information. Evaluation on a broadcast news topic segmentation task shows the superiority of the proposed approach.

Original language	English
Pages (from-to)	47-59
Number of pages	13
Journal	Soft Computing
Volume	19
Issue number	1
DOIs	https://doi.org/10.1007/s00500-014-1383-9
State	Published - Jan 2014

Keywords

Normalized cuts
Spoken document retrieval
Story segmentation
Topic boundary detection
Topic segmentation

Access to Document

10.1007/s00500-014-1383-9

Cite this

@article{1bdc51960f2044909a56e4063b641477,

title = "Topic segmentation on spoken documents using self-validated acoustic cuts",

abstract = "Topic segmentation is a necessary prerequisite for multimedia content analysis and management. The normalized cuts (NCuts) approach has shown superior performance in topic segmentation of spoken documents. However, this method requires the number of topics in a document to be known prior to segmentation, which is impractical for real-world applications given the exponential growth of multimedia data. Moreover, previous lexical approaches to spoken document segmentation, including NCuts, work on text transcripts generated by a large vocabulary continuous speech recognizer (LVCSR). Training such a recognizer requires a large amount of transcribed speech data and language-specific knowledge, and inevitable speech recognition errors and the out-of-vocabulary (OOV) problem degrade segmentation performance. This paper addresses these problems with a self-validated acoustic normalized cuts approach, namely SACuts. First, compared with NCuts, our approach can determine the number of topics in a spoken document automatically without extra computational load. Second, compared with lexical approaches that rely on a high-resource speech recognizer, our approach can achieve comparable or even better segmentation performance using only acoustic-level information. Evaluation on a broadcast news topic segmentation task shows the superiority of the proposed approach.",

keywords = "Normalized cuts, Spoken document retrieval, Story segmentation, Topic boundary detection, Topic segmentation",

author = "Hongjie Chen and Lei Xie and Wei Feng and Lilei Zheng and Yanning Zhang",

note = "Publisher Copyright: {\textcopyright} 2014, Springer-Verlag Berlin Heidelberg.",

year = "2014",

month = jan,

doi = "10.1007/s00500-014-1383-9",

language = "English",

volume = "19",

pages = "47--59",

journal = "Soft Computing",

issn = "1432-7643",

publisher = "Springer Science and Business Media Deutschland GmbH",

number = "1",

}

TY - JOUR

T1 - Topic segmentation on spoken documents using self-validated acoustic cuts

AU - Chen, Hongjie

AU - Xie, Lei

AU - Feng, Wei

AU - Zheng, Lilei

AU - Zhang, Yanning

PY - 2014/1

Y1 - 2014/1

N2 - Topic segmentation is a necessary prerequisite for multimedia content analysis and management. The normalized cuts (NCuts) approach has shown superior performance in topic segmentation of spoken documents. However, this method requires the number of topics in a document to be known prior to segmentation, which is impractical for real-world applications given the exponential growth of multimedia data. Moreover, previous lexical approaches to spoken document segmentation, including NCuts, work on text transcripts generated by a large vocabulary continuous speech recognizer (LVCSR). Training such a recognizer requires a large amount of transcribed speech data and language-specific knowledge, and inevitable speech recognition errors and the out-of-vocabulary (OOV) problem degrade segmentation performance. This paper addresses these problems with a self-validated acoustic normalized cuts approach, namely SACuts. First, compared with NCuts, our approach can determine the number of topics in a spoken document automatically without extra computational load. Second, compared with lexical approaches that rely on a high-resource speech recognizer, our approach can achieve comparable or even better segmentation performance using only acoustic-level information. Evaluation on a broadcast news topic segmentation task shows the superiority of the proposed approach.

KW - Normalized cuts

KW - Spoken document retrieval

KW - Story segmentation

KW - Topic boundary detection

KW - Topic segmentation

UR - http://www.scopus.com/inward/record.url?scp=84921701501&partnerID=8YFLogxK

U2 - 10.1007/s00500-014-1383-9

DO - 10.1007/s00500-014-1383-9

M3 - Article

AN - SCOPUS:84921701501

SN - 1432-7643

VL - 19

SP - 47

EP - 59

JO - Soft Computing

JF - Soft Computing

IS - 1

ER -
