
Broadcast news story segmentation using probabilistic latent semantic analysis and laplacian eigenmaps

  • Mimi Lu
  • Lilei Zheng
  • Cheung Chi Leung
  • Lei Xie
  • Bin Ma
  • Haizhou Li

Research output: Contribution to conference › Paper › peer-review

13 Scopus citations

Abstract

This paper proposes integrating probabilistic latent semantic analysis (PLSA) and Laplacian Eigenmaps (LE) for broadcast news story segmentation. PLSA can address synonymy and polysemy problems by exploring the semantic relations that underlie the actual occurrences of words. LE can provide a data transformation with the advantage of preserving the original temporal structure of sentence cohesive relations. We adopt PLSA statistics in place of term frequency to represent sentences and measure their connective strength. LE analysis is then performed on the connective strength matrix so that the sentence relations become geometrically evident for discriminating different stories. A dynamic programming (DP) algorithm is used for story boundary identification. Experiments show that the proposed method achieves superior story segmentation performance, with the highest F1-measure of 0.7536 on the TDT2 Mandarin BN corpus.
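The pipeline described in the abstract (topic-based sentence representation, connective strength matrix, LE embedding, DP boundary search) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-sentence topic distributions are hand-made toy values standing in for PLSA output, cosine similarity is assumed as the connective strength measure, the DP cost (within-segment scatter) and the known number of stories are assumptions, and the function names `connective_strength`, `laplacian_eigenmaps`, and `dp_segment` are hypothetical.

```python
import numpy as np

def connective_strength(topic_dist):
    """Cosine similarity between per-sentence topic vectors (one row per sentence)."""
    unit = topic_dist / np.linalg.norm(topic_dist, axis=1, keepdims=True)
    return unit @ unit.T

def laplacian_eigenmaps(W, dim):
    """Embed sentences by solving L y = lambda D y, where L = D - W."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian; its eigenvectors u give y = D^(-1/2) u.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial first eigenvector and keep the next `dim`.
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]

def dp_segment(X, n_segments):
    """Split the rows of X into contiguous segments with minimal within-segment scatter.

    Assumed cost function; the paper's actual DP criterion may differ.
    """
    n = len(X)
    # Prefix sums make each segment cost an O(1) lookup.
    csum = np.vstack([np.zeros(X.shape[1]), np.cumsum(X, axis=0)])
    csq = np.concatenate([[0.0], np.cumsum((X ** 2).sum(axis=1))])

    def cost(i, j):  # scatter of rows i .. j-1 around their mean
        s = csum[j] - csum[i]
        return (csq[j] - csq[i]) - s @ s / (j - i)

    best = np.full((n_segments + 1, n + 1), np.inf)
    back = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j] = c
                    back[k, j] = i

    # Trace back the start index of each segment, last segment first.
    starts = []
    j = n
    for k in range(n_segments, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    return sorted(starts[:-1])  # drop the start of the first segment (index 0)

# Toy stand-in for PLSA output: 12 sentences, 3 topics, 3 stories of 4 sentences each.
topic_dist = np.array(
    [[0.8, 0.1, 0.1]] * 4 + [[0.1, 0.8, 0.1]] * 4 + [[0.1, 0.1, 0.8]] * 4
)
W = connective_strength(topic_dist)
Y = laplacian_eigenmaps(W, dim=2)
boundaries = dp_segment(Y, n_segments=3)
print(boundaries)  # sentence indices at which a new story starts
```

In this toy setting, sentences from the same story collapse to nearly the same point in the embedding, which is what lets a simple contiguous-segment search find the story boundaries.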

Original language: English
Pages: 356-360
Number of pages: 5
State: Published - 2011
Event: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 - Xi'an, China
Duration: 18 Oct 2011 → 21 Oct 2011

Conference

Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011
Country/Territory: China
City: Xi'an
Period: 18/10/11 → 21/10/11
