TY - JOUR
T1 - Broadcast news story segmentation using conditional random fields and multimodal features
AU - Wang, Xiaoxuan
AU - Xie, Lei
AU - Lu, Mimi
AU - Ma, Bin
AU - Chng, Eng Siong
AU - Li, Haizhou
PY - 2012/5
Y1 - 2012/5
N2 - In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to detect each candidate as boundary/non-boundary tags based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.
AB - In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to detect each candidate as boundary/non-boundary tags based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.
KW - Conditional random fields
KW - Story segmentation
UR - http://www.scopus.com/inward/record.url?scp=84860630137&partnerID=8YFLogxK
U2 - 10.1587/transinf.E95.D.1206
DO - 10.1587/transinf.E95.D.1206
M3 - Article
AN - SCOPUS:84860630137
SN - 0916-8532
VL - E95-D
SP - 1206
EP - 1215
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
IS - 5
ER -