Abstract
This paper proposes integrating probabilistic latent semantic analysis (PLSA) and Laplacian Eigenmaps (LE) for broadcast news story segmentation. PLSA addresses synonymy and polysemy by exploring the underlying semantic relations beneath the actual occurrences of words. LE provides a data transformation that preserves the original temporal structure of sentence cohesive relations. We adopt PLSA statistics in place of term frequency as the representation of sentences and use them to measure the connective strength between sentences. LE analysis is then performed on the connective strength matrix so that the sentence relations become geometrically evident for discriminating different stories. A dynamic programming (DP) algorithm is used for story boundary identification. Experiments show that the proposed method achieves superior story segmentation performance, with the highest F1-measure of 0.7536 on the TDT2 Mandarin BN corpus.
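The pipeline described in the abstract (sentence representation, connective strength matrix, LE embedding, DP boundary search) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that each sentence is already represented by a PLSA topic posterior vector, that connective strength is cosine similarity, that the embedding uses the symmetric normalized graph Laplacian, and that the DP minimizes within-story squared deviation for a known number of stories. The function names `connective_strength`, `laplacian_eigenmap`, and `dp_segment` are hypothetical.

```python
import numpy as np

def connective_strength(topic_dists):
    """Cosine similarity between sentence vectors (assumed PLSA posteriors P(z|s))."""
    X = np.asarray(topic_dists, dtype=float)
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    return X @ X.T

def laplacian_eigenmap(W, dim):
    """Embed sentences with the smallest non-trivial eigenvectors of the graph Laplacian."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues returned in ascending order
    # Skip the trivial first eigenvector; rescaling by D^{-1/2} gives
    # solutions of the generalized problem L y = lambda D y.
    return vecs[:, 1:dim + 1] * d_inv_sqrt[:, None]

def dp_segment(Y, n_segments):
    """Split rows of Y into contiguous segments with minimal within-segment squared deviation.

    Assumes the number of stories is known; returns the start index of each
    segment after the first.
    """
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    # Prefix sums so that each segment cost is computed in O(dim) time.
    s1 = np.vstack([np.zeros(Y.shape[1]), np.cumsum(Y, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((Y ** 2).sum(axis=1))])

    def cost(i, j):  # squared deviation of rows i..j-1 from their mean
        seg_sum = s1[j] - s1[i]
        return (s2[j] - s2[i]) - seg_sum @ seg_sum / (j - i)

    best = np.full((n_segments + 1, n + 1), np.inf)
    back = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j], back[k, j] = c, i
    # Trace back the segment start positions.
    starts, j = [], n
    for k in range(n_segments, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    return sorted(starts[:-1])  # drop the leading 0

# Toy input: six sentences, the first three dominated by one topic,
# the last three by another (made-up numbers, not from the paper).
topics = [[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3
W = connective_strength(topics)
Y = laplacian_eigenmap(W, dim=1)
print(dp_segment(Y, 2))  # → [3]
```

On the toy input, the second eigenvector takes one constant value on the first three sentences and another on the last three, so the DP places the single boundary before the fourth sentence. A real system would estimate the PLSA posteriors from a corpus and would not normally know the number of stories in advance.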
| Original language | English |
|---|---|
| Pages | 356-360 |
| Number of pages | 5 |
| State | Published - 2011 |
| Event | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 - Xi'an, China. Duration: 18 Oct 2011 → 21 Oct 2011 |
Conference
| Conference | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 |
|---|---|
| Country/Territory | China |
| City | Xi'an |
| Period | 18/10/11 → 21/10/11 |
Title: Broadcast news story segmentation using probabilistic latent semantic analysis and Laplacian eigenmaps