Sentence boundary detection in Chinese broadcast news using conditional random fields and prosodic features

Chenglin Xu, Lei Xie, Zhonghua Fu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

This paper studies the use of conditional random fields (CRF) and prosodic features for sentence boundary detection in Chinese broadcast news. Previous approaches mostly use first-order CRF and ignore important contextual and sequential information. In this paper, we explore high-order CRF models to make full use of the contextual and sequential information. Moreover, we show the effectiveness of CRF in sentence boundary detection by comparing it with various competitive models. In previous approaches, the prosodic feature set is usually designed to be as exhaustive as possible. As a result, features may be highly correlated and some of them may not be effective. We therefore use a correlation-based feature selection method to select a subset of the most useful features. Finally, the use of prosodic features, e.g., pitch, in Chinese sentence segmentation deserves further investigation because the tonal aspect of Chinese may complicate the expression of pitch features. We study the effectiveness of the prosodic features and rank their importance by an analysis of feature usage.

Original language: English
Title of host publication: 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 37-41
Number of pages: 5
ISBN (Electronic): 9781479954032
DOI
Publication status: Published - 3 Sep 2014
Event: 2nd IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Xi'an, China
Duration: 9 Jul 2014 to 13 Jul 2014

Publication series

Name: 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings

Conference

Conference: 2nd IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014
Country/Territory: China
City: Xi'an
Period: 9/07/14 to 13/07/14
