TY - JOUR
T1 - Cascaded transformer U-net for image restoration
AU - Yan, Longbin
AU - Zhao, Min
AU - Liu, Shumin
AU - Shi, Shuaikai
AU - Chen, Jie
N1 - Publisher Copyright: © 2022
PY - 2023/5
Y1 - 2023/5
N2 - Image restoration is one of the most important computer vision tasks, aiming to recover high-quality images from degraded or low-quality observations. Restoration methods based on convolutional neural networks (CNNs) have achieved attractive performance. However, because convolutions only take in local information, CNN-based methods are limited in modeling long-range objects and extracting global information. In addition, existing one-stage methods lose performance because they lack diversified receptive fields. In this paper, we propose a multi-stage cascaded transformer architecture for image restoration. First, a Swin transformer based encoder relying on self-attention is used to improve the modeling of long-range objects and to output hierarchical multi-level semantic features. Then, a shape perceiving module is designed and embedded in the decoder to enhance the representation of irregular objects. Moreover, a multi-stage cascaded encoder-decoder architecture with diversified receptive fields is proposed to progressively obtain fine restoration results and thus boost performance. We conduct extensive experiments on image deraining, underwater image enhancement, near infrared image colorization, and low-light image enhancement. The results show that the proposed method can achieve comparable or better performance than state-of-the-art methods at lower training and inference cost.
AB - Image restoration is one of the most important computer vision tasks, aiming to recover high-quality images from degraded or low-quality observations. Restoration methods based on convolutional neural networks (CNNs) have achieved attractive performance. However, because convolutions only take in local information, CNN-based methods are limited in modeling long-range objects and extracting global information. In addition, existing one-stage methods lose performance because they lack diversified receptive fields. In this paper, we propose a multi-stage cascaded transformer architecture for image restoration. First, a Swin transformer based encoder relying on self-attention is used to improve the modeling of long-range objects and to output hierarchical multi-level semantic features. Then, a shape perceiving module is designed and embedded in the decoder to enhance the representation of irregular objects. Moreover, a multi-stage cascaded encoder-decoder architecture with diversified receptive fields is proposed to progressively obtain fine restoration results and thus boost performance. We conduct extensive experiments on image deraining, underwater image enhancement, near infrared image colorization, and low-light image enhancement. The results show that the proposed method can achieve comparable or better performance than state-of-the-art methods at lower training and inference cost.
KW - Encoder-decoder structure
KW - Image deraining
KW - Long-range dependence modeling
KW - Near infrared image colorization
KW - Underwater image enhancement
UR - http://www.scopus.com/inward/record.url?scp=85145973736&partnerID=8YFLogxK
U2 - 10.1016/j.sigpro.2022.108902
DO - 10.1016/j.sigpro.2022.108902
M3 - Article
AN - SCOPUS:85145973736
SN - 0165-1684
VL - 206
JO - Signal Processing
JF - Signal Processing
M1 - 108902
ER -