TY - JOUR
T1 - Robust Dereverberation with Kronecker Product Based Multichannel Linear Prediction
AU - Yang, Wenxing
AU - Huang, Gongping
AU - Chen, Jingdong
AU - Benesty, Jacob
AU - Cohen, Israel
AU - Kellermann, Walter
N1 - Publisher Copyright:
© 1994-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - Reverberation impairs not only the speech quality, but also intelligibility. The weighted-prediction-error (WPE) method, which estimates the late reverberation component based on a multichannel linear predictor, is by far one of the most effective algorithms for dereverberation. Generally, the WPE prediction filter in every short-time-Fourier-transform (STFT) subband has to be long enough to estimate accurately the late reverberation component. As a consequence, WPE is computationally expensive, which makes it difficult to implement into real-time embedded or edge computing devices. Moreover, WPE is sensitive to additive noise and its performance may suffer from dramatic degradation even in environments where the signal-to-noise ratio (SNR) is high. To address these drawbacks, this letter proposes to decompose the multichannel linear prediction filter as a Kronecker product of a temporal (interframe) prediction filter and a spatial filter. An iterative algorithm is then developed to optimize the two filters. In comparison with the original WPE algorithm, the presented method not only exhibits better performance in terms of dereverberation and robustness to additive noise, as there are fewer parameters to estimate for a given number of observation signal samples, but is also computationally more efficient, since the dimensions of the covariance matrices after Kronecker product decomposition are smaller.
AB - Reverberation impairs not only the speech quality, but also intelligibility. The weighted-prediction-error (WPE) method, which estimates the late reverberation component based on a multichannel linear predictor, is by far one of the most effective algorithms for dereverberation. Generally, the WPE prediction filter in every short-time-Fourier-transform (STFT) subband has to be long enough to estimate accurately the late reverberation component. As a consequence, WPE is computationally expensive, which makes it difficult to implement into real-time embedded or edge computing devices. Moreover, WPE is sensitive to additive noise and its performance may suffer from dramatic degradation even in environments where the signal-to-noise ratio (SNR) is high. To address these drawbacks, this letter proposes to decompose the multichannel linear prediction filter as a Kronecker product of a temporal (interframe) prediction filter and a spatial filter. An iterative algorithm is then developed to optimize the two filters. In comparison with the original WPE algorithm, the presented method not only exhibits better performance in terms of dereverberation and robustness to additive noise, as there are fewer parameters to estimate for a given number of observation signal samples, but is also computationally more efficient, since the dimensions of the covariance matrices after Kronecker product decomposition are smaller.
KW - Beamforming
KW - dereverberation
KW - Kronecker product filter
KW - noise robustness
KW - speech enhancement
KW - weighted-prediction-error
UR - http://www.scopus.com/inward/record.url?scp=85098759062&partnerID=8YFLogxK
U2 - 10.1109/LSP.2020.3044796
DO - 10.1109/LSP.2020.3044796
M3 - Article
AN - SCOPUS:85098759062
SN - 1070-9908
VL - 28
SP - 101
EP - 105
JO - IEEE Signal Processing Letters
JF - IEEE Signal Processing Letters
M1 - 9293360
ER -