TY - JOUR
T1 - SharpFormer: Learning Local Feature Preserving Global Representations for Image Deblurring
T2 - IEEE Transactions on Image Processing
AU - Yan, Qingsen
AU - Gong, Dong
AU - Wang, Pei
AU - Zhang, Zhen
AU - Zhang, Yanning
AU - Shi, Javen Qinfeng
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The goal of dynamic scene deblurring is to remove the motion blur present in a given image. To recover details from severe blur, conventional convolutional neural network (CNN)-based methods typically increase the number of convolution layers, enlarge kernel sizes, or use multi-scale inputs to expand the receptive field. However, these methods neglect the non-uniform nature of blur and cannot extract varied local and global information. Unlike CNN-based methods, we propose a Transformer-based model for image deblurring, named SharpFormer, which directly learns long-range dependencies via a novel Transformer module to overcome large blur variations. Transformers are good at learning global information but poor at capturing local information. To overcome this issue, we design a novel Locality preserving Transformer (LTransformer) block to integrate sufficient local information into global features. In addition, to apply the LTransformer effectively to medium-resolution features, a hybrid block is introduced to capture intermediate mixed features. Furthermore, we use a dynamic convolution (DyConv) block, which aggregates multiple parallel convolution kernels to handle the non-uniform blur of inputs. We leverage a powerful two-stage attentive framework composed of the above blocks to learn global, hybrid, and local features effectively. Extensive experiments on the GoPro and REDS datasets show that the proposed SharpFormer performs favourably against state-of-the-art methods in blurred image restoration.
AB - The goal of dynamic scene deblurring is to remove the motion blur present in a given image. To recover details from severe blur, conventional convolutional neural network (CNN)-based methods typically increase the number of convolution layers, enlarge kernel sizes, or use multi-scale inputs to expand the receptive field. However, these methods neglect the non-uniform nature of blur and cannot extract varied local and global information. Unlike CNN-based methods, we propose a Transformer-based model for image deblurring, named SharpFormer, which directly learns long-range dependencies via a novel Transformer module to overcome large blur variations. Transformers are good at learning global information but poor at capturing local information. To overcome this issue, we design a novel Locality preserving Transformer (LTransformer) block to integrate sufficient local information into global features. In addition, to apply the LTransformer effectively to medium-resolution features, a hybrid block is introduced to capture intermediate mixed features. Furthermore, we use a dynamic convolution (DyConv) block, which aggregates multiple parallel convolution kernels to handle the non-uniform blur of inputs. We leverage a powerful two-stage attentive framework composed of the above blocks to learn global, hybrid, and local features effectively. Extensive experiments on the GoPro and REDS datasets show that the proposed SharpFormer performs favourably against state-of-the-art methods in blurred image restoration.
KW - Deblurring
KW - global information
KW - locality preserving
KW - long-range dependencies
KW - transformer
UR - http://www.scopus.com/inward/record.url?scp=85160273371&partnerID=8YFLogxK
U2 - 10.1109/TIP.2023.3251029
DO - 10.1109/TIP.2023.3251029
M3 - Article
C2 - 37186531
AN - SCOPUS:85160273371
SN - 1057-7149
VL - 32
SP - 2857
EP - 2866
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -