A Bidirectional LSTM Approach with Word Embeddings for Sentence Boundary Detection

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

Recovering sentence boundaries from speech and its transcripts is essential for readability and for downstream speech and language processing tasks. In this paper, we propose using a deep recurrent neural network to detect sentence boundaries in broadcast news by modeling rich prosodic and lexical features extracted at each inter-word position. We introduce an unsupervised word embedding, learned with the Continuous Bag-of-Words (CBOW) model, as an effective feature representing word identity in the sentence boundary detection task; this embedding captures syntactic information that is essential for the task. In addition, we propose two low-dimensional word embeddings derived by supervised learning from a neural network that incorporates class and context information: one is extracted from the projection layer, the other from the last hidden layer. Furthermore, we propose a deep bidirectional Long Short-Term Memory (LSTM) architecture with Viterbi decoding for sentence boundary detection. Under this framework, long-range dependencies of prosodic and lexical information in temporal sequences are modeled effectively. Compared with the previous state-of-the-art DNN-CRF method, the proposed LSTM approach reduces the NIST SU error by 24.8% and 9.8% relative on reference and recognized transcripts, respectively.
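The decoding step described in the abstract — choosing a boundary/no-boundary label at every inter-word position from per-position scores plus label-transition scores — can be illustrated with a minimal Viterbi sketch. This is a generic illustration, not the paper's implementation: the emission scores stand in for (hypothetical) BLSTM log-posteriors, and the transition matrix is an assumed stand-in for whatever transition model the authors used.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions:   list of per-position log-scores, one list per
                 inter-word position, e.g. [no_boundary, boundary].
    transitions: transitions[i][j] is the log-score of moving from
                 label i at position t-1 to label j at position t.
    """
    n_labels = len(emissions[0])
    scores = list(emissions[0])          # best score ending in each label
    backpointers = []                    # argmax predecessors per position
    for em in emissions[1:]:
        new_scores, ptrs = [], []
        for j in range(n_labels):
            # best previous label to transition into label j
            best_i = max(range(n_labels),
                         key=lambda i: scores[i] + transitions[i][j])
            ptrs.append(best_i)
            new_scores.append(scores[best_i] + transitions[best_i][j] + em[j])
        scores, backpointers = new_scores, backpointers + [ptrs]
    # backtrace from the best final label
    best = max(range(n_labels), key=lambda j: scores[j])
    path = [best]
    for ptrs in reversed(backpointers):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return path
```

With zero transition scores the result reduces to a per-position argmax; a transition matrix that penalizes boundaries smooths the sequence toward fewer sentence breaks, which is the benefit of sequence-level decoding over independent per-position decisions.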

Original language: English
Pages (from-to): 1063-1075
Number of pages: 13
Journal: Journal of Signal Processing Systems
Volume: 90
Issue number: 7
DOIs
State: Published - 1 Jul 2018

Keywords

  • Long short-term memory
  • Recurrent neural network
  • Sentence boundary detection
  • Word embedding
