LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder with Exact Lattice Generation

Hang Lv; Daniel Povey; Mahsa Yarmohammadi; Ke Li; Yiming Wang; Lei Xie; Sanjeev Khudanpur

doi:10.1109/LSP.2021.3067220

School of Computer Science

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

We propose a novel lazy-evaluation token-group decoding algorithm with on-the-fly composition of weighted finite-state transducers (WFSTs) for large vocabulary continuous speech recognition. In the standard on-the-fly composition decoder, a base WFST and one or more incremental WFSTs are composed during decoding, and a token-passing algorithm then generates the lattice on the composed search space, which incurs substantial computational overhead. To improve speed, the proposed algorithm adopts 1) a token-group method, which groups tokens that share the same base-WFST state on each frame and limits the capacity of each group, and 2) a lazy-evaluation method, which does not expand a token group and its source token groups until it processes a word label during decoding. Experiments show that the proposed decoder runs up to 3 times faster than the standard on-the-fly composition decoder.
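
As a rough illustration of the token-group idea in the abstract, the following Python sketch groups the tokens of one frame by their base-WFST state and caps each group's size. It is not the authors' implementation. The `Token` fields, the `group_tokens` helper, and the choice to keep the lowest-cost tokens when a group exceeds its capacity are all assumptions made for this example; the abstract only states that tokens are grouped by base-WFST state and that group capacity is limited.

```python
from collections import defaultdict
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Token:
    base_state: int  # state in the base WFST (the grouping key)
    incr_state: int  # state in an incremental WFST (hypothetical field)
    cost: float      # accumulated cost; lower is better (assumption)


def group_tokens(tokens, capacity):
    """Group one frame's tokens by base-WFST state.

    Each group keeps at most `capacity` tokens. Keeping the
    lowest-cost ones is an assumed pruning rule, not one taken
    from the paper.
    """
    groups = defaultdict(list)
    for tok in tokens:
        groups[tok.base_state].append(tok)
    return {
        state: heapq.nsmallest(capacity, toks, key=lambda t: t.cost)
        for state, toks in groups.items()
    }


# Made-up tokens for a single frame.
frame_tokens = [
    Token(base_state=7, incr_state=1, cost=4.2),
    Token(base_state=7, incr_state=2, cost=3.1),
    Token(base_state=7, incr_state=5, cost=9.8),
    Token(base_state=9, incr_state=1, cost=5.0),
]
groups = group_tokens(frame_tokens, capacity=2)
```

With `capacity=2`, the three tokens sharing base state 7 are reduced to the two cheapest, while the single token in base state 9 is kept as is.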

Original language	English
Article number	9381702
Pages (from-to)	703-707
Number of pages	5
Journal	IEEE Signal Processing Letters
Volume	28
DOI	https://doi.org/10.1109/LSP.2021.3067220
Publication status	Published - 2021

Cite this

@article{2abd8d15f01244409981b808c0e8701f,

title = "LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder with Exact Lattice Generation",

abstract = "We propose a novel lazy-evaluation token-group decoding algorithm with on-the-fly composition of weighted finite-state transducers (WFSTs) for large vocabulary continuous speech recognition. In the standard on-the-fly composition decoder, a base WFST and one or more incremental WFSTs are composed during decoding, and then token passing algorithm is employed to generate the lattice on the composed search space, resulting in substantial computation overhead. To improve speed, the proposed algorithm adopts 1) a token-group method, which groups tokens with the same state in the base WFST on each frame and limits the capacity of the group and 2) a lazy-evaluation method, which does not expand a token group and its source token groups until it processes a word label during decoding. Experiments show that the proposed decoder works notably up to 3 times faster than the standard on-the-fly composition decoder.",

keywords = "On-the-fly composition, on-the-fly lattice rescoring, speech recognition, WFST",

author = "Hang Lv and Daniel Povey and Mahsa Yarmohammadi and Ke Li and Yiming Wang and Lei Xie and Sanjeev Khudanpur",

note = "Publisher Copyright: {\textcopyright} 1994-2012 IEEE.",

year = "2021",

doi = "10.1109/LSP.2021.3067220",

language = "English",

volume = "28",

pages = "703--707",

journal = "IEEE Signal Processing Letters",

issn = "1070-9908",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - LET-Decoder

T2 - A WFST-Based Lazy-Evaluation Token-Group Decoder with Exact Lattice Generation

AU - Lv, Hang

AU - Povey, Daniel

AU - Yarmohammadi, Mahsa

AU - Li, Ke

AU - Wang, Yiming

AU - Xie, Lei

AU - Khudanpur, Sanjeev

PY - 2021

Y1 - 2021

N2 - We propose a novel lazy-evaluation token-group decoding algorithm with on-the-fly composition of weighted finite-state transducers (WFSTs) for large vocabulary continuous speech recognition. In the standard on-the-fly composition decoder, a base WFST and one or more incremental WFSTs are composed during decoding, and then token passing algorithm is employed to generate the lattice on the composed search space, resulting in substantial computation overhead. To improve speed, the proposed algorithm adopts 1) a token-group method, which groups tokens with the same state in the base WFST on each frame and limits the capacity of the group and 2) a lazy-evaluation method, which does not expand a token group and its source token groups until it processes a word label during decoding. Experiments show that the proposed decoder works notably up to 3 times faster than the standard on-the-fly composition decoder.

AB - We propose a novel lazy-evaluation token-group decoding algorithm with on-the-fly composition of weighted finite-state transducers (WFSTs) for large vocabulary continuous speech recognition. In the standard on-the-fly composition decoder, a base WFST and one or more incremental WFSTs are composed during decoding, and then token passing algorithm is employed to generate the lattice on the composed search space, resulting in substantial computation overhead. To improve speed, the proposed algorithm adopts 1) a token-group method, which groups tokens with the same state in the base WFST on each frame and limits the capacity of the group and 2) a lazy-evaluation method, which does not expand a token group and its source token groups until it processes a word label during decoding. Experiments show that the proposed decoder works notably up to 3 times faster than the standard on-the-fly composition decoder.

KW - On-the-fly composition

KW - on-the-fly lattice rescoring

KW - speech recognition

KW - WFST

UR - http://www.scopus.com/inward/record.url?scp=85103239650&partnerID=8YFLogxK

U2 - 10.1109/LSP.2021.3067220

DO - 10.1109/LSP.2021.3067220

M3 - Article

AN - SCOPUS:85103239650

SN - 1070-9908

VL - 28

SP - 703

EP - 707

JO - IEEE Signal Processing Letters

JF - IEEE Signal Processing Letters

M1 - 9381702

ER -
