Cascaded transformer U-net for image restoration

Longbin Yan; Min Zhao; Shumin Liu; Shuaikai Shi; Jie Chen

doi:10.1016/j.sigpro.2022.108902

School of Marine Science and Technology

Research output: Contribution to journal › Article › peer-review

25 Citations (Scopus)

Abstract

Image restoration is one of the most important computer vision tasks; it aims to recover high-quality images from degraded or low-quality observations. Restoration methods based on convolutional neural networks (CNNs) have achieved attractive performance. However, because convolutions only take in local information, CNN-based methods are limited in modeling objects over long ranges and in extracting global information. In addition, existing one-stage methods lose performance because they lack diversified receptive fields. In this paper, we propose a multi-stage cascaded transformer architecture for image restoration. First, a Swin transformer based encoder relying on self-attention is used to improve the modeling of long-range objects and to output hierarchical multi-level semantic features. Then, a shape perceiving module is designed and embedded in the decoder to enhance the representation of irregular objects. Moreover, a multi-stage cascaded encoder-decoder architecture possessing diversified receptive fields is proposed to progressively obtain fine restoration results and thus boost performance. We conduct extensive experiments on image deraining, underwater image enhancement, near infrared image colorization and low-light image enhancement. The results show that the proposed method achieves performance comparable to or better than state-of-the-art methods at lower training and inference cost.

Original language	English
Article number	108902
Journal	Signal Processing
Volume	206
DOI	https://doi.org/10.1016/j.sigpro.2022.108902
Publication status	Published - May 2023

Cite this

@article{6617300b0cb74708b0790de3f0d42905,

title = "Cascaded transformer U-net for image restoration",

abstract = "Image restoration is one of the most important computer vision tasks; it aims to recover high-quality images from degraded or low-quality observations. Restoration methods based on convolutional neural networks (CNNs) have achieved attractive performance. However, because convolutions only take in local information, CNN-based methods are limited in modeling objects over long ranges and in extracting global information. In addition, existing one-stage methods lose performance because they lack diversified receptive fields. In this paper, we propose a multi-stage cascaded transformer architecture for image restoration. First, a Swin transformer based encoder relying on self-attention is used to improve the modeling of long-range objects and to output hierarchical multi-level semantic features. Then, a shape perceiving module is designed and embedded in the decoder to enhance the representation of irregular objects. Moreover, a multi-stage cascaded encoder-decoder architecture possessing diversified receptive fields is proposed to progressively obtain fine restoration results and thus boost performance. We conduct extensive experiments on image deraining, underwater image enhancement, near infrared image colorization and low-light image enhancement. The results show that the proposed method achieves performance comparable to or better than state-of-the-art methods at lower training and inference cost.",

keywords = "Encoder-decoder structure, Image deraining, Long-range dependence modeling, Near infrared image colorization, Underwater image enhancement",

author = "Longbin Yan and Min Zhao and Shumin Liu and Shuaikai Shi and Jie Chen",

note = "Publisher Copyright: {\textcopyright} 2022",

year = "2023",

month = may,

doi = "10.1016/j.sigpro.2022.108902",

language = "English",

volume = "206",

journal = "Signal Processing",

issn = "0165-1684",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - Cascaded transformer U-net for image restoration

AU - Yan, Longbin

AU - Zhao, Min

AU - Liu, Shumin

AU - Shi, Shuaikai

AU - Chen, Jie

PY - 2023/5

Y1 - 2023/5

AB - Image restoration is one of the most important computer vision tasks; it aims to recover high-quality images from degraded or low-quality observations. Restoration methods based on convolutional neural networks (CNNs) have achieved attractive performance. However, because convolutions only take in local information, CNN-based methods are limited in modeling objects over long ranges and in extracting global information. In addition, existing one-stage methods lose performance because they lack diversified receptive fields. In this paper, we propose a multi-stage cascaded transformer architecture for image restoration. First, a Swin transformer based encoder relying on self-attention is used to improve the modeling of long-range objects and to output hierarchical multi-level semantic features. Then, a shape perceiving module is designed and embedded in the decoder to enhance the representation of irregular objects. Moreover, a multi-stage cascaded encoder-decoder architecture possessing diversified receptive fields is proposed to progressively obtain fine restoration results and thus boost performance. We conduct extensive experiments on image deraining, underwater image enhancement, near infrared image colorization and low-light image enhancement. The results show that the proposed method achieves performance comparable to or better than state-of-the-art methods at lower training and inference cost.

KW - Encoder-decoder structure

KW - Image deraining

KW - Long-range dependence modeling

KW - Near infrared image colorization

KW - Underwater image enhancement

UR - http://www.scopus.com/inward/record.url?scp=85145973736&partnerID=8YFLogxK

U2 - 10.1016/j.sigpro.2022.108902

DO - 10.1016/j.sigpro.2022.108902

M3 - Article

AN - SCOPUS:85145973736

SN - 0165-1684

VL - 206

JO - Signal Processing

JF - Signal Processing

M1 - 108902

ER -