Unsupervised broadcast news story segmentation using distance dependent Chinese restaurant processes

Chao Yang, Lei Xie, Xiangzeng Zhou

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

15 Citations (Scopus)

Abstract

Traditional unsupervised approaches to broadcast news story segmentation require the number of segments to be set manually, yet this number is often unknown in real-world applications. In this paper, we solve this problem by modeling the generative process of stories as distance dependent Chinese restaurant process (dd-CRP) mixtures. We cut a news program into fixed-size text blocks and assume that blocks in the same story are generated from a story-specific topic. Specifically, we add a dd-CRP prior, whose essential bias is that a block's topic is more likely to be the same as that of nearby blocks. Story boundaries can then be found by detecting topic changes. Experiments show that our approach outperforms both supervised and unsupervised approaches and that the number of segments can be learned automatically from data.
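The role of the dd-CRP prior described in the abstract can be illustrated with a minimal sketch. It only draws block-to-block links from a sequential dd-CRP prior and reads story boundaries off the resulting clusters; it is not the paper's inference procedure and uses no text likelihood. The exponential decay function, the parameters `alpha` and `decay_scale`, and all function names are assumptions made for illustration.

```python
import math
import random


def sample_links(n_blocks, alpha=1.0, decay_scale=1.0, rng=None):
    """Draw one link per block from a sequential dd-CRP prior (illustrative)."""
    rng = rng or random.Random(0)
    links = []
    for i in range(n_blocks):
        # Candidates: every earlier block j < i, weighted by an assumed
        # exponential decay of the distance i - j, so nearby blocks are favored.
        weights = [math.exp(-(i - j) / decay_scale) for j in range(i)]
        # A self-link (j == i) has weight alpha and starts a new cluster.
        weights.append(alpha)
        links.append(rng.choices(range(i + 1), weights=weights)[0])
    return links


def links_to_clusters(links):
    """Blocks connected through links share one cluster (one story topic)."""
    labels = []
    for i, j in enumerate(links):
        # Links only point backwards, so labels[j] is already known when j < i.
        labels.append(i if j == i else labels[j])
    return labels


def boundaries(labels):
    """Place a boundary before block i when its cluster differs from block i - 1."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


if __name__ == "__main__":
    links = sample_links(20, alpha=0.5, decay_scale=1.0)
    labels = links_to_clusters(links)
    print(links)
    print(labels)
    print(boundaries(labels))
```

Note that the number of clusters is never passed in: it falls out of how many self-links are drawn, which is the property the abstract relies on when it says the number of segments is learned from data.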

Original language: English
Title of host publication: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4062-4066
Number of pages: 5
ISBN (Print): 9781479928927
Publication status: Published - 2014
Event: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: 4 May 2014 to 9 May 2014

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country/Territory: Italy
City: Florence
Period: 4/05/14 to 9/05/14
