TY - GEN
T1 - The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Switching ASR Challenge
AU - Liang, Yuhao
AU - Chen, Peikun
AU - Yu, Fan
AU - Zhu, Xinfa
AU - Xu, Tianyi
AU - Gao, Yingying
AU - Xie, Lei
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - This paper describes our NPU-ASLP system submitted to the ISCSLP 2022 Magichub Code-Switching ASR Challenge. In this challenge, we first explore several popular end-to-end multilingual ASR architectures and training strategies, including bi-encoder, language-aware encoder (LAE) and mixture of experts (MoE). To improve our system's language modeling ability, we further explore an internal language model as well as a long-context language model. Given the limited training data in the challenge, we also investigate data augmentation strategies, including speed perturbation, pitch shifting, speech codecs, SpecAugment and synthetic data from text-to-speech (TTS). Finally, we explore ROVER-based score fusion to make full use of complementary hypotheses from different models. Our submitted system achieves a mix error rate (MER) of 16.87% on the test set and ranks 2nd in the challenge.
KW - Automatic Speech Recognition
KW - Code-Switching
KW - Data Augmentation
UR - http://www.scopus.com/inward/record.url?scp=85148638098&partnerID=8YFLogxK
U2 - 10.1109/ISCSLP57327.2022.10037962
DO - 10.1109/ISCSLP57327.2022.10037962
M3 - Conference contribution
AN - SCOPUS:85148638098
T3 - 2022 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022
SP - 532
EP - 536
BT - 2022 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022
A2 - Lee, Kong Aik
A2 - Lee, Hung-yi
A2 - Lu, Yanfeng
A2 - Dong, Minghui
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022
Y2 - 11 December 2022 through 14 December 2022
ER -