An asynchronous WFST-based decoder for automatic speech recognition

Hang Lv; Zhehuai Chen; Hainan Xu; Daniel Povey; Lei Xie; Sanjeev Khudanpur

doi:10.1109/ICASSP39728.2021.9414509

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

We introduce an asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models into one-pass decoding for large-vocabulary continuous speech recognition. Standard one-pass decoding with an on-the-fly composition decoder can incur significant computational overhead. In contrast, the asynchronous dynamic decoder has a novel design with two fronts, one performing “exploration” and the other “backfill”. Computation alternates between the two fronts during decoding, which results in more effective pruning than standard one-pass decoding with an on-the-fly composition decoder. Experiments show that the proposed decoder is notably faster than standard one-pass decoding with an on-the-fly composition decoder, and that the speed-up becomes more pronounced as data complexity increases.
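
The abstract names A* search but this record gives no implementation details. As a generic illustration only (this is textbook A* on a toy graph, not the authors' decoder, and it does not model the two fronts), the following sketch shows how a cost-to-go estimate steers a best-first search. The names `a_star`, `graph`, and `heuristic`, and all weights, are hypothetical.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (cost, path) for the cheapest start-to-goal path, or None."""
    # Queue entries are (priority, cost_so_far, node, path),
    # where priority = cost_so_far + heuristic estimate of the remaining cost.
    queue = [(heuristic[start], 0.0, start, [start])]
    best_cost = {start: 0.0}
    while queue:
        _, cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    queue,
                    (new_cost + heuristic[nxt], new_cost, nxt, path + [nxt]),
                )
    return None


# Toy graph (assumed for illustration): node -> list of (successor, arc cost).
graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 2.0), ("g", 6.0)],
    "b": [("g", 1.0)],
}
# Estimates that never exceed the true remaining cost (admissible).
heuristic = {"s": 3.0, "a": 3.0, "b": 1.0, "g": 0.0}

result = a_star(graph, heuristic, "s", "g")
print(result)
```

In this toy case the search returns the path s, a, b, g with total cost 4.0, and the direct but costlier arc from a to g is never expanded because its priority is worse than that of the route through b.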

Original language	English
Title of host publication	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	6019-6023
Number of pages	5
ISBN (Electronic)	9781728176055
DOIs	https://doi.org/10.1109/ICASSP39728.2021.9414509
State	Published - 2021
Event	2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada Duration: 6 Jun 2021 → 11 Jun 2021

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume	2021-June
ISSN (Print)	1520-6149

Conference

Conference	2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021
Country/Territory	Canada
City	Virtual, Toronto
Period	6 Jun 2021 → 11 Jun 2021

Keywords

Automatic speech recognition
Decoder
Lattice generation
Lattice pruning

Access to Document

10.1109/ICASSP39728.2021.9414509

Cite this

Lv, H., Chen, Z., Xu, H., Povey, D., Xie, L., & Khudanpur, S. (2021). An asynchronous WFST-based decoder for automatic speech recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp. 6019-6023). (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2021-June). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP39728.2021.9414509

Lv, Hang ; Chen, Zhehuai ; Xu, Hainan et al. / An asynchronous WFST-based decoder for automatic speech recognition. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 6019-6023 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{52c55e26cb9640cebd62f7d0dae80784,

title = "An asynchronous WFST-based decoder for automatic speech recognition",

abstract = "We introduce asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models in the one-pass decoding for large vocabulary continuous speech recognition. Unlike standard one-pass decoding with on-the-fly composition decoder which might induce a significant computation overhead, the asynchronous dynamic decoder has a novel design where it has two fronts, with one performing “exploration” and the other “backfill”. The computation of the two fronts alternates in the decoding process, resulting in more effective pruning than the standard one-pass decoding with an on-the-fly composition decoder. Experiments show that the proposed decoder works notably faster than the standard one-pass decoding with on-the-fly composition decoder, while the acceleration will be more obvious with the increment of data complexity.",

keywords = "Automatic speech recognition, Decoder, Lattice generation, Lattice pruning",

author = "Hang Lv and Zhehuai Chen and Hainan Xu and Daniel Povey and Lei Xie and Sanjeev Khudanpur",

note = "Publisher Copyright: {\textcopyright} 2021 IEEE; 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 ; Conference date: 06-06-2021 Through 11-06-2021",

year = "2021",

doi = "10.1109/ICASSP39728.2021.9414509",

language = "English",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "6019--6023",

booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

}

Lv, H, Chen, Z, Xu, H, Povey, D, Xie, L & Khudanpur, S 2021, An asynchronous WFST-based decoder for automatic speech recognition. in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2021-June, Institute of Electrical and Electronics Engineers Inc., pp. 6019-6023, 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021, Virtual, Toronto, Canada, 6/06/21. https://doi.org/10.1109/ICASSP39728.2021.9414509

An asynchronous WFST-based decoder for automatic speech recognition. / Lv, Hang; Chen, Zhehuai; Xu, Hainan et al.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2021. p. 6019-6023 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2021-June).

TY - GEN

T1 - An asynchronous WFST-based decoder for automatic speech recognition

AU - Lv, Hang

AU - Chen, Zhehuai

AU - Xu, Hainan

AU - Povey, Daniel

AU - Xie, Lei

AU - Khudanpur, Sanjeev

PY - 2021

Y1 - 2021

AB - We introduce asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models in the one-pass decoding for large vocabulary continuous speech recognition. Unlike standard one-pass decoding with on-the-fly composition decoder which might induce a significant computation overhead, the asynchronous dynamic decoder has a novel design where it has two fronts, with one performing “exploration” and the other “backfill”. The computation of the two fronts alternates in the decoding process, resulting in more effective pruning than the standard one-pass decoding with an on-the-fly composition decoder. Experiments show that the proposed decoder works notably faster than the standard one-pass decoding with on-the-fly composition decoder, while the acceleration will be more obvious with the increment of data complexity.

KW - Automatic speech recognition

KW - Decoder

KW - Lattice generation

KW - Lattice pruning

UR - http://www.scopus.com/inward/record.url?scp=85115071718&partnerID=8YFLogxK

U2 - 10.1109/ICASSP39728.2021.9414509

DO - 10.1109/ICASSP39728.2021.9414509

M3 - Conference contribution

AN - SCOPUS:85115071718

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 6019

EP - 6023

BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021

Y2 - 6 June 2021 through 11 June 2021

ER -

Lv H, Chen Z, Xu H, Povey D, Xie L, Khudanpur S. An asynchronous WFST-based decoder for automatic speech recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2021. p. 6019-6023. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP39728.2021.9414509
