A DNN-HMM approach to story segmentation

Jia Yu; Xiong Xiao; Lei Xie; Eng Siong Chng; Haizhou Li

doi:10.21437/Interspeech.2016-873

School of Computer Science

Research output: Contribution to journal › Conference article › peer-review

20 Scopus citations

Abstract

Hidden Markov model (HMM) is one of the popular techniques for story segmentation, where hidden Markov states represent the topics, and the emission distributions of n-gram language model (LM) are dependent on the states. Given a text document, a Viterbi decoder finds the hidden story sequence, with a change of topic indicating a story boundary. In this paper, we propose a discriminative approach to story boundary detection. In the HMM framework, we use deep neural network (DNN) to estimate the posterior probability of topics given the bag-of-words in the local context. We call it the DNN-HMM approach. We consider the topic dependent LM as a generative modeling technique, and the DNN-HMM as the discriminative solution. Experiments on topic detection and tracking (TDT2) task show that DNN-HMM outperforms traditional n-gram LM approach significantly and achieves state-of-the-art performance.
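
The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-sentence topic posteriors are supplied by hand in place of a trained DNN, and the function names, the uniform "stay or switch" transition model, and the `stay_prob` value are all assumptions made for illustration. Dividing each posterior by the topic prior to obtain a scaled likelihood is the usual hybrid DNN-HMM convention, and it is assumed here rather than taken from the abstract.

```python
import math


def _log(x):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(x) if x > 0 else float("-inf")


def viterbi_topics(posteriors, priors, stay_prob=0.8):
    """Return the most likely topic index for each sentence.

    posteriors[t][k] stands in for a DNN output p(topic k | context of sentence t).
    priors[k] is p(topic k). stay_prob is a hypothetical self-transition
    probability; the remaining mass is shared equally by the other topics.
    """
    if not posteriors:
        return []
    n_topics = len(priors)
    switch_prob = (1.0 - stay_prob) / (n_topics - 1) if n_topics > 1 else 0.0

    def emit(t, k):
        # Scaled likelihood: p(x | k) is proportional to p(k | x) / p(k).
        return _log(posteriors[t][k]) - _log(priors[k])

    def trans(j, k):
        return _log(stay_prob if j == k else switch_prob)

    # Initial scores: log prior plus emission score of the first sentence.
    score = [_log(priors[k]) + emit(0, k) for k in range(n_topics)]
    back = []  # back[t-1][k] = best predecessor of topic k at sentence t

    for t in range(1, len(posteriors)):
        new_score, pointers = [], []
        for k in range(n_topics):
            best_j = max(range(n_topics), key=lambda j: score[j] + trans(j, k))
            new_score.append(score[best_j] + trans(best_j, k) + emit(t, k))
            pointers.append(best_j)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final topic.
    state = max(range(n_topics), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def story_boundaries(path):
    """A boundary is any sentence whose topic differs from the previous one."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


# Hand-made posteriors for four sentences and two topics (illustrative only).
example = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
topics = viterbi_topics(example, priors=[0.5, 0.5])
print(topics, story_boundaries(topics))
```

With these toy inputs the decoder assigns the first two sentences to topic 0 and the last two to topic 1, so a single boundary is reported before the third sentence. The transition penalty is what keeps one weakly ambiguous sentence from triggering a spurious boundary.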

Original language	English
Pages (from-to)	1527-1531
Number of pages	5
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	08-12-September-2016
DOIs	https://doi.org/10.21437/Interspeech.2016-873
State	Published - 2016
Event	17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States
Duration	8 Sep 2016 → 16 Sep 2016

Keywords

Deep neural network
Hidden Markov model
Story segmentation

Cite this

@article{2e88dc8b0f124e569ca47cfe8467296c,

title = "A DNN-HMM approach to story segmentation",

abstract = "Hidden Markov model (HMM) is one of the popular techniques for story segmentation, where hidden Markov states represent the topics, and the emission distributions of n-gram language model (LM) are dependent on the states. Given a text document, a Viterbi decoder finds the hidden story sequence, with a change of topic indicating a story boundary. In this paper, we propose a discriminative approach to story boundary detection. In the HMM framework, we use deep neural network (DNN) to estimate the posterior probability of topics given the bag-of-words in the local context. We call it the DNN-HMM approach. We consider the topic dependent LM as a generative modeling technique, and the DNN-HMM as the discriminative solution. Experiments on topic detection and tracking (TDT2) task show that DNN-HMM outperforms traditional n-gram LM approach significantly and achieves state-of-the-art performance.",

keywords = "Deep neural network, Hidden Markov model, Story segmentation",

author = "Jia Yu and Xiong Xiao and Lei Xie and Chng, {Eng Siong} and Haizhou Li",

note = "Publisher Copyright: Copyright {\textcopyright} 2016 ISCA.; 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 ; Conference date: 08-09-2016 Through 16-09-2016",

year = "2016",

doi = "10.21437/Interspeech.2016-873",

language = "English",

volume = "08-12-September-2016",

pages = "1527--1531",

journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

issn = "2308-457X",

}

TY - JOUR

T1 - A DNN-HMM approach to story segmentation

AU - Yu, Jia

AU - Xiao, Xiong

AU - Xie, Lei

AU - Chng, Eng Siong

AU - Li, Haizhou

PY - 2016

Y1 - 2016

N2 - Hidden Markov model (HMM) is one of the popular techniques for story segmentation, where hidden Markov states represent the topics, and the emission distributions of n-gram language model (LM) are dependent on the states. Given a text document, a Viterbi decoder finds the hidden story sequence, with a change of topic indicating a story boundary. In this paper, we propose a discriminative approach to story boundary detection. In the HMM framework, we use deep neural network (DNN) to estimate the posterior probability of topics given the bag-of-words in the local context. We call it the DNN-HMM approach. We consider the topic dependent LM as a generative modeling technique, and the DNN-HMM as the discriminative solution. Experiments on topic detection and tracking (TDT2) task show that DNN-HMM outperforms traditional n-gram LM approach significantly and achieves state-of-the-art performance.

AB - Hidden Markov model (HMM) is one of the popular techniques for story segmentation, where hidden Markov states represent the topics, and the emission distributions of n-gram language model (LM) are dependent on the states. Given a text document, a Viterbi decoder finds the hidden story sequence, with a change of topic indicating a story boundary. In this paper, we propose a discriminative approach to story boundary detection. In the HMM framework, we use deep neural network (DNN) to estimate the posterior probability of topics given the bag-of-words in the local context. We call it the DNN-HMM approach. We consider the topic dependent LM as a generative modeling technique, and the DNN-HMM as the discriminative solution. Experiments on topic detection and tracking (TDT2) task show that DNN-HMM outperforms traditional n-gram LM approach significantly and achieves state-of-the-art performance.

KW - Deep neural network

KW - Hidden Markov model

KW - Story segmentation

UR - http://www.scopus.com/inward/record.url?scp=84994275859&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2016-873

DO - 10.21437/Interspeech.2016-873

M3 - Conference article

AN - SCOPUS:84994275859

SN - 2308-457X

VL - 08-12-September-2016

SP - 1527

EP - 1531

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

T2 - 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016

Y2 - 8 September 2016 through 16 September 2016

ER -
