Laplacian eigenmaps for automatic news story segmentation

Zihan Liu; Lei Xie; Lilei Zheng

doi:10.1109/ICALIP.2010.5684548

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a novel lexical-similarity-based approach to automatic story segmentation in broadcast news. When measuring the connection between a pair of sentences, we take two factors into consideration, i.e. the lexical similarity and the distance between them in the text stream. Further investigation of pairwise connections between sentences is based on the technique of Laplacian Eigenmaps (LE). Taking advantage of the LE algorithm, we construct a Euclidean space in which each sentence is mapped to a vector. The original connective strength between sentences is reflected by the Euclidean distances between the corresponding vectors in the target space of the map. Further analysis of the map leads to a straightforward criterion for optimal segmentation. Then we formalize story segmentation as a minimization problem and give a dynamic programming solution to it. Experimental results on the TDT2 corpus show that the proposed method outperforms several state-of-the-art lexical-similarity-based methods.
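The pipeline outlined in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact connection weight, the embedding dimension, or the segmentation criterion. The code below assumes a cosine similarity damped by an exponential distance term (the `sigma` parameter is hypothetical), uses the standard generalized-eigenvector form of Laplacian Eigenmaps, and swaps in within-segment scatter as the cost minimized by dynamic programming, with the number of stories given in advance. All function names are illustrative.

```python
import numpy as np

def connection_matrix(X, sigma=2.0):
    """Connection strength between sentences (assumed form, not from the paper):
    cosine similarity of term-count rows times exp(-|i - j| / sigma)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                      # avoid division by zero for empty sentences
    Xn = X / norms
    cos = Xn @ Xn.T
    idx = np.arange(len(X))
    decay = np.exp(-np.abs(idx[:, None] - idx[None, :]) / sigma)
    W = cos * decay
    np.fill_diagonal(W, 0.0)                     # no self-connections
    return W

def laplacian_eigenmap(W, dim=1):
    """Map each sentence to a vector by solving L y = lambda D y via the
    symmetric normalized Laplacian; the trivial first eigenvector is skipped."""
    d = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)              # eigenvalues in ascending order
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]

def segment(Y, n_segments):
    """Dynamic programming over boundaries, minimizing the summed squared
    distance of each embedded sentence to its segment mean (assumed criterion)."""
    n = len(Y)
    S1 = np.vstack([np.zeros(Y.shape[1]), np.cumsum(Y, axis=0)])
    S2 = np.concatenate([[0.0], np.cumsum((Y ** 2).sum(axis=1))])

    def cost(i, j):                              # scatter of sentences i .. j-1
        s = S1[j] - S1[i]
        return (S2[j] - S2[i]) - (s @ s) / (j - i)

    best = np.full((n_segments + 1, n + 1), np.inf)
    back = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j] = c
                    back[k, j] = i

    # Walk the back-pointers from the end to recover segment start indices.
    bounds, j = [], n
    for k in range(n_segments, 0, -1):
        j = int(back[k, j])
        bounds.append(j)
    return sorted(bounds[:-1])                   # drop the leading 0

# Toy input: three sentences on one topic, then three on another,
# with one shared word so the sentence graph stays connected.
X = np.array([[3, 3, 0, 0, 1]] * 3 + [[0, 0, 3, 3, 1]] * 3, dtype=float)
boundaries = segment(laplacian_eigenmap(connection_matrix(X)), n_segments=2)
print(boundaries)
```

In this toy input the weak cross-topic links make the first non-trivial eigenvector nearly constant within each topic, so the single boundary falls between the third and fourth sentences. Real broadcast-news transcripts would need a tuned `sigma`, more embedding dimensions, and a way to choose the number of stories.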

Original language	English
Title of host publication	ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings
Pages	419-424
Number of pages	6
DOIs	https://doi.org/10.1109/ICALIP.2010.5684548
State	Published - 2010
Event	2010 International Conference on Audio, Language and Image Processing, ICALIP 2010 - Shanghai, China Duration: 23 Nov 2010 → 25 Nov 2010

Publication series

Name	ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings

Conference

Conference	2010 International Conference on Audio, Language and Image Processing, ICALIP 2010
Country/Territory	China
City	Shanghai
Period	23/11/10 → 25/11/10

Access to Document

10.1109/ICALIP.2010.5684548

Cite this

Liu, Z., Xie, L., & Zheng, L. (2010). Laplacian eigenmaps for automatic news story segmentation. In ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings (pp. 419-424). Article 5684548 (ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings). https://doi.org/10.1109/ICALIP.2010.5684548

@inproceedings{986739c74e934044b635b5d143608319,

title = "Laplacian eigenmaps for automatic news story segmentation",

abstract = "This paper presents a novel lexical-similarity-based approach to automatic story segmentation in broadcast news. When measuring the connection between a pair of sentences, we take two factors into consideration, i.e. the lexical similarity and the distance between them in the text stream. Further investigation of pairwise connections between sentences is based on the technique of Laplacian Eigenmaps (LE). Taking advantage of the LE algorithm, we construct a Euclidean space in which each sentence is mapped to a vector. The original connective strength between sentences is reflected by the Euclidean distances between the corresponding vectors in the target space of the map. Further analysis of the map leads to a straightforward criterion for optimal segmentation. Then we formalize story segmentation as a minimization problem and give a dynamic programming solution to it. Experimental results on the TDT2 corpus show that the proposed method outperforms several state-of-the-art lexical-similarity-based methods.",

author = "Zihan Liu and Lei Xie and Lilei Zheng",

year = "2010",

doi = "10.1109/ICALIP.2010.5684548",

language = "English",

isbn = "9781424458653",

series = "ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings",

pages = "419--424",

booktitle = "ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings",

note = "2010 International Conference on Audio, Language and Image Processing, ICALIP 2010 ; Conference date: 23-11-2010 Through 25-11-2010",

}

Liu, Z, Xie, L & Zheng, L 2010, Laplacian eigenmaps for automatic news story segmentation. in ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings., 5684548, ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings, pp. 419-424, 2010 International Conference on Audio, Language and Image Processing, ICALIP 2010, Shanghai, China, 23/11/10. https://doi.org/10.1109/ICALIP.2010.5684548

Laplacian eigenmaps for automatic news story segmentation. / Liu, Zihan; Xie, Lei; Zheng, Lilei.
ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings. 2010. p. 419-424 5684548 (ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings).

TY - GEN

T1 - Laplacian eigenmaps for automatic news story segmentation

AU - Liu, Zihan

AU - Xie, Lei

AU - Zheng, Lilei

PY - 2010

Y1 - 2010

N2 - This paper presents a novel lexical-similarity-based approach to automatic story segmentation in broadcast news. When measuring the connection between a pair of sentences, we take two factors into consideration, i.e. the lexical similarity and the distance between them in the text stream. Further investigation of pairwise connections between sentences is based on the technique of Laplacian Eigenmaps (LE). Taking advantage of the LE algorithm, we construct a Euclidean space in which each sentence is mapped to a vector. The original connective strength between sentences is reflected by the Euclidean distances between the corresponding vectors in the target space of the map. Further analysis of the map leads to a straightforward criterion for optimal segmentation. Then we formalize story segmentation as a minimization problem and give a dynamic programming solution to it. Experimental results on the TDT2 corpus show that the proposed method outperforms several state-of-the-art lexical-similarity-based methods.

AB - This paper presents a novel lexical-similarity-based approach to automatic story segmentation in broadcast news. When measuring the connection between a pair of sentences, we take two factors into consideration, i.e. the lexical similarity and the distance between them in the text stream. Further investigation of pairwise connections between sentences is based on the technique of Laplacian Eigenmaps (LE). Taking advantage of the LE algorithm, we construct a Euclidean space in which each sentence is mapped to a vector. The original connective strength between sentences is reflected by the Euclidean distances between the corresponding vectors in the target space of the map. Further analysis of the map leads to a straightforward criterion for optimal segmentation. Then we formalize story segmentation as a minimization problem and give a dynamic programming solution to it. Experimental results on the TDT2 corpus show that the proposed method outperforms several state-of-the-art lexical-similarity-based methods.

UR - http://www.scopus.com/inward/record.url?scp=79851506994&partnerID=8YFLogxK

U2 - 10.1109/ICALIP.2010.5684548

DO - 10.1109/ICALIP.2010.5684548

M3 - Conference contribution

AN - SCOPUS:79851506994

SN - 9781424458653

T3 - ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings

SP - 419

EP - 424

BT - ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings

T2 - 2010 International Conference on Audio, Language and Image Processing, ICALIP 2010

Y2 - 23 November 2010 through 25 November 2010

ER -
