TY - JOUR
T1 - From Patch to Pixel
T2 - A Transformer-Based Hierarchical Framework for Compressive Image Sensing
AU - Gan, Hongping
AU - Shen, Minghe
AU - Hua, Yi
AU - Ma, Chunyan
AU - Zhang, Tao
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2023
Y1 - 2023
N2 - Convolutional neural network (CNN)-based reconstruction methods have dominated compressive sensing (CS) in recent years. However, existing CNN-based approaches can be limited in capturing the non-local similarity of images because of the intrinsic characteristics of convolutional layers, i.e., locality and weight sharing. In parallel, the emerging Transformer architecture shows a strong capacity for modeling long-distance correlations among embedded tokens for language and images. Yet the vanilla Transformer does not considerably exceed CNN-based networks and shows only roughly comparable performance; a likely culprit is the lack of a sophisticated inductive bias regarding local image structures. In this article, to overcome the restrictions of both paradigms, we propose a Transformer-based hierarchical framework, dubbed TCS-Net, for compressive image sensing (or image compressive sensing) in a patch-to-pixel manner. Concretely, the proposed TCS-Net consists of an image acquisition module and a reconstruction module, the latter comprising two key decoding phases: a patch-wise decoding phase and a pixel-wise decoding phase. The acquisition module implements data-driven image sampling by learning jointly with the decoding phases. By adapting the Transformer architecture to the patch-to-pixel multi-stage pattern, our reconstruction module gradually decodes the CS measurements from patch-wise outlines to pixel-wise textures, thereby building a high-precision mapping for image reconstruction. Extensive experiments on several datasets verify that the proposed TCS-Net outperforms existing state-of-the-art image CS methods by considerable margins.
KW - Compressive sensing
KW - image reconstruction
KW - patch-to-pixel
KW - transformer
UR - https://www.scopus.com/pages/publications/85149328352
U2 - 10.1109/TCI.2023.3244396
DO - 10.1109/TCI.2023.3244396
M3 - Article
AN - SCOPUS:85149328352
SN - 2573-0436
VL - 9
SP - 133
EP - 146
JO - IEEE Transactions on Computational Imaging
JF - IEEE Transactions on Computational Imaging
ER -