Broadcast news story segmentation using conditional random fields and multimodal features

Xiaoxuan Wang; Lei Xie; Mimi Lu; Bin Ma; Eng Siong Chng; Haizhou Li

doi:10.1587/transinf.E95.D.1206

School of Computer Science

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to detect each candidate as boundary/non-boundary tags based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.
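
The abstract describes tagging a sequence of boundary candidates with a linear-chain CRF. The following is a minimal sketch of the decoding step only, not the authors' implementation: the feature names (`pause`, `speaker_change`, `shot_boundary`), the weights, the bias and the transition scores are all hypothetical values chosen for illustration, whereas a real CRF would learn them from annotated data. It shows how a label-transition score lets the model prefer a single boundary where two adjacent candidates both look boundary-like on their own.

```python
from typing import Dict, List, Sequence

LABELS = ("non-boundary", "boundary")

# Hypothetical, hand-picked parameters; a trained CRF would estimate these.
WEIGHTS: Dict[str, float] = {"pause": 2.0, "speaker_change": 1.5, "shot_boundary": 1.0}
BIAS = -2.5
# TRANSITIONS[a][b] is the score of label a followed by label b.
# Two consecutive boundaries are penalised (assumed value).
TRANSITIONS = [[0.0, 0.0], [0.0, -3.0]]


def emission_scores(candidate: Dict[str, float]) -> List[float]:
    """Score both labels for one candidate from its multimodal features."""
    boundary = BIAS + sum(w * candidate.get(name, 0.0) for name, w in WEIGHTS.items())
    return [0.0, boundary]  # the non-boundary label is the zero reference


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label index sequence."""
    if not emissions:
        return []
    n_labels = len(transitions)
    score = list(emissions[0])
    back: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for y in range(n_labels):
            # Best previous label for reaching label y at position t.
            prev = max(range(n_labels), key=lambda p: score[p] + transitions[p][y])
            pointers.append(prev)
            new_score.append(score[prev] + transitions[prev][y] + emissions[t][y])
        score = new_score
        back.append(pointers)
    best = max(range(n_labels), key=lambda y: score[y])
    path = [best]
    for pointers in reversed(back):  # follow back-pointers to the start
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Four made-up candidate positions; the second and third both look boundary-like.
candidates = [
    {"pause": 0.1},
    {"pause": 1.0, "speaker_change": 1.0, "shot_boundary": 1.0},
    {"pause": 0.8, "speaker_change": 1.0},
    {"pause": 0.0},
]
emissions = [emission_scores(c) for c in candidates]
path = viterbi(emissions, TRANSITIONS)
tags = [LABELS[i] for i in path]
print(tags)
```

With these assumed scores, an independent per-candidate classifier would mark both the second and third candidates as boundaries, while the sequence decoder keeps only the stronger one. This is one way to read the abstract's remark that interlabel relations are captured by the sequential framework.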

Original language	English
Pages (from-to)	1206-1215
Number of pages	10
Journal	IEICE Transactions on Information and Systems
Volume	E95-D
Issue number	5
DOIs	https://doi.org/10.1587/transinf.E95.D.1206
State	Published - May 2012

Keywords

Conditional random fields
Story segmentation

Cite this

@article{e7c263938a9340f5a8479d73ff5fb511,

title = "Broadcast news story segmentation using conditional random fields and multimodal features",

abstract = "In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to detect each candidate as boundary/non-boundary tags based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.",

keywords = "Conditional random fields, Story segmentation",

author = "Xiaoxuan Wang and Lei Xie and Mimi Lu and Bin Ma and Chng, {Eng Siong} and Haizhou Li",

year = "2012",

month = may,

doi = "10.1587/transinf.E95.D.1206",

language = "English",

volume = "E95-D",

pages = "1206--1215",

journal = "IEICE Transactions on Information and Systems",

issn = "0916-8532",

publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",

number = "5",

}

TY - JOUR

T1 - Broadcast news story segmentation using conditional random fields and multimodal features

AU - Wang, Xiaoxuan

AU - Xie, Lei

AU - Lu, Mimi

AU - Ma, Bin

AU - Chng, Eng Siong

AU - Li, Haizhou

PY - 2012/5

Y1 - 2012/5

N2 - In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to detect each candidate as boundary/non-boundary tags based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.

AB - In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to detect each candidate as boundary/non-boundary tags based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.

KW - Conditional random fields

KW - Story segmentation

UR - http://www.scopus.com/inward/record.url?scp=84860630137&partnerID=8YFLogxK

U2 - 10.1587/transinf.E95.D.1206

DO - 10.1587/transinf.E95.D.1206

M3 - Article

AN - SCOPUS:84860630137

SN - 0916-8532

VL - E95-D

SP - 1206

EP - 1215

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

IS - 5

ER -
