On Multi-input Multi-frame MVDR Filter for Speech Enhancement with Heterophasic Presentation

  • Zixuan Chen
  • Hanchen Pei
  • Jilu Jin
  • Xueqin Luo
  • Ningning Pan
  • Gongping Huang
  • Jingdong Chen
  • Jacob Benesty
  • Wuhan University
  • Northwestern Polytechnical University Xian
  • Southwestern University of Finance and Economics
  • Institut national de la recherche scientifique

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

Abstract

Multi-channel speech enhancement attempts to recover a target speech signal from noisy observations by exploiting the spatial information captured by a microphone array. Conventional approaches typically produce a single output that contains both the desired speech and some residual noise, and so neglect the benefits of the human binaural hearing system. To overcome this limitation, we propose in this work a multi-input multi-frame binaural-output (MIMFBO) noise reduction method operating in the short-time Fourier transform (STFT) domain. The method uses both inter-channel and inter-frame correlations to design binaural filters that maximize the interaural coherence (IC) of the desired speech signal while minimizing the IC of the noise, all under distortionless constraints on the target speech. As a result, the perceived target signal and the residual noise are spatially separated, which substantially enhances speech intelligibility. Simulation results show that the proposed method achieves significantly higher PESQ scores than both the single-input binaural-output MVDR and the multi-input binaural-output MVDR approaches. Subjective listening tests also confirm its perceptual benefits.
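The two building blocks named in the abstract, a distortionless (MVDR) filter and the interaural coherence, can be illustrated with a minimal sketch. This is not the MIMFBO method of the paper: it shows only the classic single-output MVDR weights and one common definition of complex coherence between two channels. The function names, the random noise covariance, and the steering vector are hypothetical, chosen purely for illustration.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Classic MVDR weights: minimize w^H R w subject to w^H d = 1."""
    # Solve R x = d rather than forming an explicit inverse.
    r_inv_d = np.linalg.solve(noise_cov, steering)
    # Normalize so the filter passes the steering direction undistorted.
    return r_inv_d / (steering.conj() @ r_inv_d)


def interaural_coherence(left, right):
    """Normalized complex cross-correlation of two channels (assumed definition)."""
    cross = np.mean(left * np.conj(right))
    power = np.sqrt(np.mean(np.abs(left) ** 2) * np.mean(np.abs(right) ** 2))
    return cross / power


# Hypothetical setup: 4 microphones, random Hermitian positive-definite noise covariance.
rng = np.random.default_rng(0)
num_mics = 4
a = rng.standard_normal((num_mics, num_mics)) + 1j * rng.standard_normal((num_mics, num_mics))
noise_cov = a @ a.conj().T + num_mics * np.eye(num_mics)
steering = np.exp(-1j * np.pi * 0.3 * np.arange(num_mics))

w = mvdr_weights(noise_cov, steering)
response = np.vdot(w, steering)  # w^H d, equal to 1 up to rounding
```

Under this definition, two identical channels have a coherence of 1 and two sign-inverted channels have a coherence of -1, which is the kind of contrast a binaural design could exploit to make speech and residual noise appear at different positions.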

Original language: English
Title of host publication: Man-Machine Speech Communication - 20th National Conference, NCMMSC 2025, Proceedings
Editors: Jia Jia, Zhiyong Wu, Lijian Gao, Gongping Huang, Ya Li
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 408-421
Number of pages: 14
ISBN (print): 9789819553815
Publication status: Published - 2026
Event: 20th National Conference on Man-Machine Speech Communication, NCMMSC 2025 - Zhenjiang, China
Duration: 16 Oct 2025 to 19 Oct 2025

Publication series

Name: Communications in Computer and Information Science
Volume: 2662 CCIS
ISSN (print): 1865-0929
ISSN (electronic): 1865-0937

Conference

Conference: 20th National Conference on Man-Machine Speech Communication, NCMMSC 2025
Country/Territory: China
City: Zhenjiang
Period: 16/10/25 to 19/10/25
