TY - GEN
T1 - Integrating acoustic and lexical features in topic segmentation of Chinese broadcast news using maximum entropy approach
AU - Xie, Lei
AU - Yang, Yulian
AU - Liu, Zhi Qiang
AU - Feng, Wei
AU - Liu, Zihan
PY - 2010
Y1 - 2010
N2 - This paper studies how to integrate multi-modal features in automatic topic segmentation of Mandarin broadcast news. The multi-modal feature integration problem is formulated within the Maximum Entropy (MaxEnt) scheme for topic boundary classification by maximizing the entropy while respecting all known constraints (i.e., the contributions of multiple features). We consider two types of features in particular: (1) acoustic features, which reflect the editorial prosody of broadcast news, including pause duration, speaker change and speech type; and (2) lexical features extracted from speech recognition transcripts, which capture the semantic shifts of topics, including two local cohesiveness features and a new boundary indicator based on overall cohesiveness. Compared to local lexical features, the new overall cohesiveness feature maximizes the lexical cohesiveness of all topic fragments and reflects the fact that topic transitions in broadcast news are smooth and the distributional variations are subtle. Experiments show an apparent performance improvement in topic segmentation of Chinese broadcast news by fusing acoustic and lexical features within the MaxEnt scheme.
AB - This paper studies how to integrate multi-modal features in automatic topic segmentation of Mandarin broadcast news. The multi-modal feature integration problem is formulated within the Maximum Entropy (MaxEnt) scheme for topic boundary classification by maximizing the entropy while respecting all known constraints (i.e., the contributions of multiple features). We consider two types of features in particular: (1) acoustic features, which reflect the editorial prosody of broadcast news, including pause duration, speaker change and speech type; and (2) lexical features extracted from speech recognition transcripts, which capture the semantic shifts of topics, including two local cohesiveness features and a new boundary indicator based on overall cohesiveness. Compared to local lexical features, the new overall cohesiveness feature maximizes the lexical cohesiveness of all topic fragments and reflects the fact that topic transitions in broadcast news are smooth and the distributional variations are subtle. Experiments show an apparent performance improvement in topic segmentation of Chinese broadcast news by fusing acoustic and lexical features within the MaxEnt scheme.
UR - http://www.scopus.com/inward/record.url?scp=79851492942&partnerID=8YFLogxK
U2 - 10.1109/ICALIP.2010.5684551
DO - 10.1109/ICALIP.2010.5684551
M3 - Conference contribution
AN - SCOPUS:79851492942
SN - 9781424458653
T3 - ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings
SP - 407
EP - 413
BT - ICALIP 2010 - 2010 International Conference on Audio, Language and Image Processing, Proceedings
T2 - 2010 International Conference on Audio, Language and Image Processing, ICALIP 2010
Y2 - 23 November 2010 through 25 November 2010
ER -