Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper

Tianyi Xu, Kaixun Huang, Pengcheng Guo, Yu Zhou, Longtao Huang, Hui Xue, Lei Xie

Research output: Contribution to journal › Conference article › Peer-reviewed

Abstract

Pre-trained multilingual speech foundation models, like Whisper, have shown impressive performance across different languages. However, adapting these models to new or specific languages is computationally expensive and prone to catastrophic forgetting. Addressing these issues, our study investigates strategies to enhance the model on new languages in the absence of the original training data, while preserving the established performance on the original languages. Specifically, we first compare various LoRA-based methods to assess their vulnerability to forgetting. To mitigate this issue, we propose to leverage the LoRA parameters from the original model for approximate orthogonal gradient descent on the new samples. Additionally, we introduce a learnable rank coefficient to allocate trainable parameters for more efficient training. Our experiments with a Chinese Whisper model (on Uyghur and Tibetan) yield better results with a more compact parameter set.
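The core idea of the abstract can be illustrated with a small sketch. This is a hypothetical, simplified implementation (not the authors' code): a frozen pre-trained weight is adapted through a low-rank LoRA update, and gradients computed on new-language data are projected orthogonally to the column space of the old LoRA update, approximating orthogonal gradient descent to reduce forgetting. All names and shapes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # hidden size and LoRA rank (toy values)

W = rng.standard_normal((d, d))      # frozen pre-trained weight
A_old = rng.standard_normal((r, d))  # LoRA factors learned on the original languages
B_old = rng.standard_normal((d, r))

def lora_forward(x, A, B, scale=1.0):
    """LoRA-adapted linear layer: y = x @ (W + scale * B @ A)^T."""
    return x @ (W + scale * B @ A).T

# Orthonormal basis of the old LoRA update's column space.
U, _, _ = np.linalg.svd(B_old @ A_old, full_matrices=False)
U = U[:, :r]

def project_orthogonal(grad):
    """Remove the component of grad lying in the old update's subspace,
    so a new-language step does not overwrite old-language directions."""
    return grad - U @ (U.T @ grad)

g = rng.standard_normal((d, d))  # mock gradient from a new-language batch
g_perp = project_orthogonal(g)

# The projected gradient is (numerically) orthogonal to every old direction.
print(np.abs(U.T @ g_perp).max() < 1e-8)  # True
```

In the paper's setting the projection would be applied during fine-tuning on Uyghur/Tibetan data; the sketch only shows the geometric operation on a single mock gradient.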

Original language: English
Pages (from-to): 2534-2538
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
DOI
Publication status: Published - 2024
Event: 25th Interspeech Conference 2024 - Kos Island, Greece
Duration: 1 Sept 2024 - 5 Sept 2024
