Laplacian eigenmaps for automatic news story segmentation

  • Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

This paper presents a novel lexical-similarity-based approach to automatic story segmentation in broadcast news. When measuring the connection between a pair of sentences, we take two factors into consideration, i.e. the lexical similarity and the distance between them in the text stream. Further investigation of pairwise connections between sentences is based on the technique of Laplacian Eigenmaps (LE). Taking advantage of the LE algorithm, we construct a Euclidean space in which each sentence is mapped to a vector. The original connective strength between sentences is reflected by the Euclidean distances between the corresponding vectors in the target space of the map. Further analysis of the map leads to a straightforward criterion for optimal segmentation. Then we formalize story segmentation as a minimization problem and give a dynamic programming solution to it. Experimental results on the TDT2 corpus show that the proposed method outperforms several state-of-the-art lexical-similarity-based methods.
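The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption: the connection strength is taken to be cosine similarity multiplied by an exponential decay in sentence distance (the `decay` parameter is hypothetical), the embedding uses the standard Laplacian Eigenmaps generalized eigenproblem, and the segmentation criterion is assumed to be the within-segment squared distance to the segment mean, minimized by dynamic programming for a given number of stories. The function names are illustrative only.

```python
import numpy as np

def connection_matrix(tf, decay=0.1):
    """Assumed connection strength: cosine similarity of term-frequency
    rows, damped by an exponential decay in sentence distance."""
    tf = np.asarray(tf, dtype=float)
    norms = np.linalg.norm(tf, axis=1, keepdims=True)
    unit = tf / np.maximum(norms, 1e-12)
    idx = np.arange(tf.shape[0])
    dist = np.abs(idx[:, None] - idx[None, :])
    w = (unit @ unit.T) * np.exp(-decay * dist)
    np.fill_diagonal(w, 0.0)  # no self-connections
    return w

def laplacian_eigenmap(w, dim=1):
    """Solve L y = lambda D y via the symmetric normalized Laplacian and
    return the `dim` eigenvectors after the trivial constant one."""
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.diag(deg) - w
    sym = d_inv_sqrt[:, None] * lap * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(sym)  # eigenvalues in ascending order
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]

def segment(points, n_segments):
    """Dynamic programming over contiguous segments, minimizing the total
    squared distance of each point to its segment mean (assumed criterion).
    Returns the indices at which a new segment starts."""
    n = len(points)
    s1 = np.vstack([np.zeros(points.shape[1]), np.cumsum(points, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((points ** 2).sum(axis=1))])

    def cost(i, j):  # cost of one segment covering sentences i .. j-1
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total @ total / (j - i)

    best = np.full((n_segments + 1, n + 1), np.inf)
    back = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j], back[k, j] = c, i
    bounds, j = [], n
    for k in range(n_segments, 0, -1):
        j = int(back[k, j])
        bounds.append(j)
    return sorted(bounds[:-1])  # drop the start of the first segment (0)

# Toy input (made up): six sentences, two stories, one word shared by all.
tf = np.array([
    [1, 2, 1, 0, 0, 0, 0],
    [1, 1, 2, 1, 0, 0, 0],
    [1, 0, 1, 2, 0, 0, 0],
    [1, 0, 0, 0, 2, 1, 0],
    [1, 0, 0, 0, 1, 2, 1],
    [1, 0, 0, 0, 0, 1, 2],
])
w = connection_matrix(tf)
points = laplacian_eigenmap(w, dim=1)
boundaries = segment(points, n_segments=2)
print(boundaries)
```

On this toy input the single boundary falls between the third and fourth sentence. The number of stories is passed in explicitly here for simplicity; how the paper chooses it is not stated in the abstract.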

Original language: English
Title of host publication: ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings
Pages: 419-424
Number of pages: 6
DOI
Publication status: Published - 2010
Event: 2010 International Conference on Audio, Language and Image Processing, ICALIP 2010 - Shanghai, China
Duration: 23 Nov 2010 to 25 Nov 2010

Publication series

Name: ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings

Conference

Conference: 2010 International Conference on Audio, Language and Image Processing, ICALIP 2010
Country/Territory: China
City: Shanghai
Period: 23/11/10 to 25/11/10
