LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder with Exact Lattice Generation

Hang Lv, Daniel Povey, Mahsa Yarmohammadi, Ke Li, Yiming Wang, Lei Xie, Sanjeev Khudanpur

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

We propose a novel lazy-evaluation token-group decoding algorithm with on-the-fly composition of weighted finite-state transducers (WFSTs) for large-vocabulary continuous speech recognition. In a standard on-the-fly composition decoder, a base WFST and one or more incremental WFSTs are composed during decoding, and the token-passing algorithm is then run on the composed search space to generate the lattice, incurring substantial computational overhead. To improve speed, the proposed algorithm adopts 1) a token-group method, which groups tokens that share the same base-WFST state on each frame and limits the capacity of each group, and 2) a lazy-evaluation method, which defers expanding a token group and its source token groups until a word label is processed during decoding. Experiments show that the proposed decoder runs up to 3 times faster than the standard on-the-fly composition decoder.
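The token-group idea described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the function name, the token representation as `(base_state, incremental_state, cost)` tuples, and the capacity parameter are all assumptions made here for clarity. Tokens that share the same base-WFST state on a frame are collected into one group, and each group is pruned to a fixed capacity by keeping only its best-scoring (lowest-cost) members.

```python
import heapq
from collections import defaultdict

def group_and_prune(tokens, capacity):
    """Group tokens by their base-WFST state and keep at most
    `capacity` lowest-cost tokens per group.

    `tokens` is a list of (base_state, incremental_state, cost)
    tuples; all names here are illustrative, not from the paper.
    """
    groups = defaultdict(list)
    for base_state, inc_state, cost in tokens:
        groups[base_state].append((cost, inc_state))
    pruned = {}
    for base_state, members in groups.items():
        # Capping each group's size bounds the per-frame work even
        # when many incremental-WFST states map to one base state.
        pruned[base_state] = heapq.nsmallest(capacity, members)
    return pruned

# Three tokens share base state 0; with capacity=2 the worst is dropped.
tokens = [(0, 'a', 1.5), (0, 'b', 0.7), (0, 'c', 2.1), (1, 'd', 0.3)]
print(group_and_prune(tokens, capacity=2))
```

Bounding the group capacity is what makes the per-frame cost independent of how many composed states collapse onto the same base state, which is where the standard composed-search-space decoder spends its extra time.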

Original language: English
Article number: 9381702
Pages (from-to): 703-707
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 28
DOIs
State: Published - 2021

Keywords

  • On-the-fly composition
  • on-the-fly lattice rescoring
  • speech recognition
  • WFST

