Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news

Guangpu Huang, Chenglin Xu, Xiong Xiao, Lei Xie, Eng Siong Chng, Haizhou Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We propose a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains, and use them together in the DNN-CRF model to predict sentence boundaries. We tested the accuracy of the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that the best system significantly outperforms the state-of-the-art sentence unit detection system, with a 13.2% absolute reduction in NIST sentence error rate on the reference transcription. However, the performance gain is limited on the recognized transcription, partly due to the high word error rate.
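The pipeline described in the abstract — per-word features from three views, a neural scorer, and a CRF layer that decodes the boundary/no-boundary label sequence — can be illustrated with a toy sketch. Everything below (dimensions, random weights, transition scores) is hypothetical and only shows the shape of the computation, not the paper's actual model:

```python
import numpy as np

# Toy sketch: per-word feature vectors from three views are concatenated,
# scored by a small feed-forward network, and decoded with linear-chain
# Viterbi over two labels: 0 = no boundary, 1 = sentence boundary.
rng = np.random.default_rng(0)
T = 6                                    # number of words in the utterance
acoustic = rng.normal(size=(T, 4))       # e.g. pause / pitch / energy features
articulatory = rng.normal(size=(T, 3))   # e.g. articulatory-feature posteriors
linguistic = rng.normal(size=(T, 5))     # e.g. word / POS features

x = np.concatenate([acoustic, articulatory, linguistic], axis=1)  # (T, 12)

# Tiny DNN emission scorer (one hidden layer; random weights for the sketch).
W1 = rng.normal(size=(12, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 2));  b2 = np.zeros(2)
emissions = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2   # (T, 2) label scores

# Illustrative CRF transition scores: staying in the same label is favoured,
# reflecting that sentence boundaries are comparatively rare.
trans = np.array([[ 0.5, -0.5],
                  [-0.5,  0.5]])

def viterbi(emissions, trans):
    """Standard max-sum decoding over a linear-chain CRF."""
    T, L = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        # cand[i, j] = best score ending in label i at t-1, moving to j at t
        cand = score[:, None] + trans + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

labels = viterbi(emissions, trans)
print(labels)  # one 0/1 label per word; 1 marks a predicted sentence boundary
```

In the paper's setting the emission scorer is a trained DNN and the CRF parameters are learned jointly; the sketch only fixes the decoding step, which is a plain Viterbi pass.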

Original language: English
Title of host publication: 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9786163618238
DOIs
State: Published - 12 Feb 2014
Event: 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 - Chiang Mai, Thailand
Duration: 9 Dec 2014 - 12 Dec 2014

Publication series

Name: 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014

Conference

Conference: 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
Country/Territory: Thailand
City: Chiang Mai
Period: 9/12/14 - 12/12/14

