Efficient conformer with prob-sparse attention mechanism for end-to-end speech recognition

Xiong Wang, Sining Sun, Lei Xie, Long Ma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Citations (Scopus)

Abstract

End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance. Among these models, Transformer and Conformer have achieved state-of-the-art recognition accuracy, in which self-attention plays a vital role in capturing important global information. However, the time and memory complexity of self-attention grows quadratically with the sentence length. In this paper, a prob-sparse self-attention mechanism is introduced into Conformer to sparsify the self-attention computation in order to accelerate inference and reduce memory consumption. Specifically, we adopt a Kullback-Leibler divergence based sparsity measurement for each query to decide whether the attention function is computed on that query. With the prob-sparse attention mechanism, we achieve an 8% to 45% inference speed-up and a 15% to 45% memory usage reduction in the self-attention module of the Conformer Transducer while maintaining the same level of error rate.
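The query-selection idea described in the abstract can be illustrated with a small sketch. The snippet below is a hypothetical PyTorch simplification, not the authors' implementation: it ranks each query by the max-minus-mean of its attention scores (the KL-divergence-derived sparsity measurement used in Informer-style prob-sparse attention), computes full softmax attention only for the top-ranked queries, and lets the remaining "lazy" queries fall back to the mean of the values. The name prob_sparse_attention and the parameter top_u are illustrative assumptions; for clarity the sketch builds the full score matrix, whereas the actual mechanism samples keys so that the measurement itself stays sub-quadratic.

```python
import torch

def prob_sparse_attention(q, k, v, top_u):
    """Hypothetical sketch of prob-sparse self-attention (single head).

    q, k, v: (batch, length, d) tensors; top_u: number of "active" queries
    that receive full attention. The full score matrix is built here only
    for clarity; the real mechanism samples keys to avoid this quadratic step.
    """
    b, L, d = q.shape
    scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5    # (b, L, L)

    # Sparsity measurement per query: max score minus mean score over keys.
    # A near-uniform (uninformative) attention row scores low and is skipped.
    sparsity = scores.max(dim=-1).values - scores.mean(dim=-1)  # (b, L)
    top_idx = sparsity.topk(top_u, dim=-1).indices               # (b, top_u)

    # Lazy queries fall back to the mean of the values.
    out = v.mean(dim=1, keepdim=True).expand(b, L, d).clone()

    # Full softmax attention only for the selected queries.
    idx = top_idx.unsqueeze(-1).expand(-1, -1, d)
    q_sel = torch.gather(q, 1, idx)                              # (b, top_u, d)
    attn = torch.softmax(
        torch.matmul(q_sel, k.transpose(-2, -1)) / d ** 0.5, dim=-1)
    out.scatter_(1, idx, torch.matmul(attn, v))
    return out
```

With top_u chosen on the order of log L (for example proportional to ln L, the heuristic used in Informer), only a logarithmic number of query rows require full attention, which is the source of the inference speed-up and memory reduction reported in the abstract.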

Original language: English
Title of host publication: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Publisher: International Speech Communication Association
Pages: 1898-1902
Number of pages: 5
ISBN (electronic): 9781713836902
DOI
Publication status: Published - 2021
Event: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 - Brno, Czech Republic
Duration: 30 Aug 2021 - 3 Sep 2021

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 3
ISSN (print): 2308-457X
ISSN (electronic): 1990-9772

Conference

Conference: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Country/Territory: Czech Republic
City: Brno
Period: 30/08/21 - 3/09/21
