Incorporation of a modified temporal cepstrum smoothing in both signal-to-noise ratio and speech presence probability estimation for speech enhancement

Dahan Wang, Zhongshu Hou, Yuxiang Hu, Changbao Zhu, Jing Lu, Jingdong Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Numerous advanced and lightweight signal processing methods have been presented for single-channel speech enhancement (SE). It is imperative to carefully explore how to efficiently combine, integrate, and balance these methods. This paper proposes a more effective and less resource-intensive SE system, focused on the integration and adaptation of several approaches, especially temporal cepstrum smoothing (TCS). First, a more robust fundamental frequency estimator is employed within TCS, mitigating the performance limitations caused by the inaccuracy of the original estimator. Additionally, a harmonic enhancement mechanism is introduced, effectively recovering weak harmonic components. By incorporating the modified TCS into the a posteriori speech presence probability estimation, the unbiased minimum mean square error noise power spectral density estimator is refined. The modified TCS is also used for the a priori signal-to-noise ratio estimation. Moreover, this paper enhances the log-spectral amplitude estimator by applying both super-Gaussian speech priors and speech presence uncertainty. Experimental evaluations demonstrate that the proposed method improves speech quality while maintaining modest computational and storage requirements. Furthermore, the proposed system exhibits performance comparable to that of several baseline systems based on lightweight deep neural networks.
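For orientation, the kind of baseline that the abstract says is refined, a speech-presence-probability-driven noise power spectral density update, can be sketched per time-frequency bin as follows. This is a minimal illustration of the commonly cited unmodified form, not the paper's modified estimator: the function names, the fixed 15 dB a priori SNR under speech presence, the equal priors, and the smoothing factor of 0.8 are all assumptions, and the stagnation safeguards used in practice are omitted.

```python
import math


def speech_presence_probability(noisy_power, noise_psd,
                                xi_h1_db=15.0, prior_h1=0.5):
    """A posteriori speech presence probability for one time-frequency bin.

    noisy_power : squared magnitude of the noisy spectral coefficient
    noise_psd   : current noise power estimate for that bin
    xi_h1_db    : fixed a priori SNR assumed under speech presence (assumption)
    prior_h1    : prior probability of speech presence (assumption)
    """
    xi_h1 = 10.0 ** (xi_h1_db / 10.0)
    prior_ratio = (1.0 - prior_h1) / prior_h1
    noise_psd = max(noise_psd, 1e-12)  # guard against division by zero
    exponent = (noisy_power / noise_psd) * xi_h1 / (1.0 + xi_h1)
    return 1.0 / (1.0 + prior_ratio * (1.0 + xi_h1) * math.exp(-exponent))


def update_noise_psd(noisy_power, noise_psd_prev, alpha=0.8):
    """One recursive noise PSD update weighted by speech presence probability."""
    p_h1 = speech_presence_probability(noisy_power, noise_psd_prev)
    # Estimated noise periodogram: trust the observation when speech is
    # unlikely, fall back on the previous estimate when speech is likely.
    noise_periodogram = (1.0 - p_h1) * noisy_power + p_h1 * noise_psd_prev
    return alpha * noise_psd_prev + (1.0 - alpha) * noise_periodogram


if __name__ == "__main__":
    for power in (0.5, 1.0, 4.0, 100.0):
        p = speech_presence_probability(power, 1.0)
        print(f"power={power:6.1f}  P(speech)={p:.4f}  "
              f"updated noise PSD={update_noise_psd(power, 1.0):.4f}")
```

In this sketch a bin whose power is close to the noise estimate receives a low speech presence probability and contributes to the noise update, whereas a much louder bin is treated as speech and leaves the estimate essentially unchanged.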

Original language: English
Pages (from-to): 3678-3689
Number of pages: 12
Journal: Journal of the Acoustical Society of America
Volume: 155
Issue number: 6
Publication status: Published - 1 Jun 2024
