Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news

Guangpu Huang; Chenglin Xu; Xiong Xiao; Lei Xie; Eng Siong Chng; Haizhou Li

doi:10.1109/APSIPA.2014.7041543

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We proposed a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains, and used them together in the DNN-CRF model to predict the sentence boundaries. We tested the accuracy of the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that the best system outperforms the state-of-the-art sentence unit detection system significantly by 13.2% absolute NIST sentence error rate reduction using the reference transcription. However, the performance gain is limited on the recognized transcription partly due to the high word error rate.

Original language	English
Title of host publication	2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9786163618238
DOIs	https://doi.org/10.1109/APSIPA.2014.7041543
State	Published - 12 Feb 2014
Event	2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 - Chiang Mai, Thailand; Duration: 9 Dec 2014 → 12 Dec 2014

Publication series

Name	2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014

Conference

Conference	2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
Country/Territory	Thailand
City	Chiang Mai
Period	9/12/14 → 12/12/14

Cite this

Huang, G., Xu, C., Xiao, X., Xie, L., Chng, E. S., & Li, H. (2014). Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news. In 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 Article 7041543 (2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/APSIPA.2014.7041543

Huang, Guangpu ; Xu, Chenglin ; Xiao, Xiong et al. / Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news. 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014. Institute of Electrical and Electronics Engineers Inc., 2014. (2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014).

@inproceedings{4fcf6e6a02b849bcaf2e7faee8df0905,

title = "Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news",

abstract = "This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We proposed a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains, and used them together in the DNN-CRF model to predict the sentence boundaries. We tested the accuracy of the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that the best system outperforms the state-of-the-art sentence unit detection system significantly by 13.2% absolute NIST sentence error rate reduction using the reference transcription. However, the performance gain is limited on the recognized transcription partly due to the high word error rate.",

author = "Guangpu Huang and Chenglin Xu and Xiong Xiao and Lei Xie and Chng, {Eng Siong} and Haizhou Li",

note = "Publisher Copyright: {\textcopyright} 2014 Asia-Pacific Signal and Information Processing Ass.; 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 ; Conference date: 09-12-2014 Through 12-12-2014",

year = "2014",

month = feb,

day = "12",

doi = "10.1109/APSIPA.2014.7041543",

language = "English",

series = "2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014",

}

Huang, G, Xu, C, Xiao, X, Xie, L, Chng, ES & Li, H 2014, Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news. in 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014., 7041543, 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014, Institute of Electrical and Electronics Engineers Inc., 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014, Chiang Mai, Thailand, 9/12/14. https://doi.org/10.1109/APSIPA.2014.7041543

Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news. / Huang, Guangpu; Xu, Chenglin; Xiao, Xiong et al.
2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014. Institute of Electrical and Electronics Engineers Inc., 2014. 7041543 (2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014).

TY - GEN

T1 - Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news

AU - Huang, Guangpu

AU - Xu, Chenglin

AU - Xiao, Xiong

AU - Xie, Lei

AU - Chng, Eng Siong

AU - Li, Haizhou

PY - 2014/2/12

Y1 - 2014/2/12

N2 - This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We proposed a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains, and used them together in the DNN-CRF model to predict the sentence boundaries. We tested the accuracy of the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that the best system outperforms the state-of-the-art sentence unit detection system significantly by 13.2% absolute NIST sentence error rate reduction using the reference transcription. However, the performance gain is limited on the recognized transcription partly due to the high word error rate.

AB - This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We proposed a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains, and used them together in the DNN-CRF model to predict the sentence boundaries. We tested the accuracy of the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that the best system outperforms the state-of-the-art sentence unit detection system significantly by 13.2% absolute NIST sentence error rate reduction using the reference transcription. However, the performance gain is limited on the recognized transcription partly due to the high word error rate.

UR - http://www.scopus.com/inward/record.url?scp=84949925559&partnerID=8YFLogxK

U2 - 10.1109/APSIPA.2014.7041543

DO - 10.1109/APSIPA.2014.7041543

M3 - Conference contribution

AN - SCOPUS:84949925559

T3 - 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014

BT - 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014

Y2 - 9 December 2014 through 12 December 2014

ER -

Huang G, Xu C, Xiao X, Xie L, Chng ES, Li H. Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news. In 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014. Institute of Electrical and Electronics Engineers Inc. 2014. 7041543. (2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014). doi: 10.1109/APSIPA.2014.7041543
