Lexical story co-segmentation of Chinese broadcast news

Wei Feng, Xuecheng Nie, Liang Wan, Lei Xie, Jianmin Jiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

We present an unsupervised technique, namely story co-segmentation, to automatically extract the common stories on the same topic within a pair of Chinese broadcast news transcripts. Unlike classical topic tracking, which usually relies on previously trained topic models, our method is purely data-driven and simultaneously determines the common stories in both input texts. Specifically, we propose an iterative four-step MRF solution to the problem of story co-segmentation using lexical cues only. First, we construct a sentence-level graph formulation of the input news transcripts and initialize the foreground and background labeling by lexical clustering. Second, we update both the foreground and background models based on the current labeling. Third, we formalize story co-segmentation as a Gibbs energy minimization problem that balances the objectives of foreground/background likelihood, intra-doc coherence, and inter-doc similarity. Finally, the labeling refinement is obtained by hybrid optimization with QPBO and BP. The effectiveness of our method has been validated on a real-world CCTV corpus.
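The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the cosine-based costs, the weights `lam_intra` and `lam_inter`, the shared-word initialization (standing in for lexical clustering), the English toy tokens, and above all the optimizer, which is plain iterated conditional modes (ICM) swapped in for the QPBO/BP hybrid named in the abstract.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_unary(bows, labels):
    """Model update: pool word counts per label (0 = background, 1 = foreground),
    then charge each sentence its dissimilarity to each label's model."""
    models = [Counter(), Counter()]
    for bow, x in zip(bows, labels):
        models[x].update(bow)
    return [[1.0 - cosine(bow, models[x]) for x in (0, 1)] for bow in bows]

def build_edges(bows, doc_ids, lam_intra=0.5, lam_inter=0.5):
    """Sentence-level graph: adjacent sentences in one document (intra-doc
    coherence) and lexically similar sentences across documents (inter-doc)."""
    edges = []
    for i in range(len(bows)):
        for j in range(i + 1, len(bows)):
            sim = cosine(bows[i], bows[j])
            if doc_ids[i] == doc_ids[j]:
                if j == i + 1:
                    edges.append((i, j, lam_intra * sim))
            elif sim > 0:
                edges.append((i, j, lam_inter * sim))
    return edges

def gibbs_energy(labels, unary, edges):
    """E(x) = sum_i unary[i][x_i] + sum_(i,j) w_ij * [x_i != x_j]."""
    e = sum(unary[i][x] for i, x in enumerate(labels))
    return e + sum(w for i, j, w in edges if labels[i] != labels[j])

def icm(labels, unary, edges, max_sweeps=20):
    """Greedy per-sentence relabeling (ICM); a simple stand-in, not QPBO or BP."""
    labels = list(labels)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(labels)):
            best = min((0, 1), key=lambda x: gibbs_energy(
                labels[:i] + [x] + labels[i + 1:], unary, edges))
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    return labels

# Hypothetical toy input: two already-tokenized "transcripts".
doc_a = [["flood", "river", "rescue"], ["flood", "rescue", "team"],
         ["stock", "market", "rise"]]
doc_b = [["football", "match", "goal"], ["river", "flood", "warning"],
         ["rescue", "team", "flood"]]
bows = [Counter(s) for s in doc_a + doc_b]
doc_ids = [0] * len(doc_a) + [1] * len(doc_b)

# Assumed initialization: a sentence is foreground if it shares a word
# with the other document (the paper uses lexical clustering instead).
shared = {w for s in doc_a for w in s} & {w for s in doc_b for w in s}
init = [1 if shared & set(b) else 0 for b in bows]

edges = build_edges(bows, doc_ids)
labels = init
for _ in range(3):  # alternate model update and labeling refinement
    unary = build_unary(bows, labels)
    labels = icm(labels, unary, edges)
```

ICM only reaches a local minimum of the energy, which is one reason a stronger optimizer such as the QPBO/BP hybrid would be preferred in practice; the sketch is meant only to show how the three energy terms interact.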

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 2283-2286
Number of pages: 4
State: Published - 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: 9 Sep 2012 → 13 Sep 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 3

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country/Territory: United States
City: Portland, OR
Period: 9/09/12 → 13/09/12

Keywords

  • Belief propagation (BP)
  • Foreground and background story modeling
  • Lexical clustering
  • MRF
  • QPBO
  • Story co-segmentation
