Hformer: highly efficient vision transformer for low-dose CT denoising

Shi Yu Zhang, Zhao Xuan Wang, Hai Bo Yang, Yi Lun Chen, Yang Li, Quan Pan, Hong Kai Wang, Cheng Xin Zhao

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

In this paper, we propose Hformer, a novel supervised learning model for low-dose computed tomography (LDCT) denoising. Hformer combines the strengths of convolutional neural networks for local feature extraction with those of transformer models for global feature capture. The performance of Hformer was verified and evaluated on the AAPM-Mayo Clinic LDCT Grand Challenge Dataset. Compared with representative state-of-the-art (SOTA) models of different architectures, Hformer achieved the best metrics without requiring a large number of learnable parameters: a PSNR of 33.4405, an RMSE of 8.6956, and an SSIM of 0.9163. The experiments demonstrated that Hformer is a SOTA model for noise suppression, structure preservation, and lesion detection.
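The abstract describes a hybrid design: convolutional layers for local feature extraction, self-attention for global context, and a residual denoising path (see the Keywords below). The paper's exact architecture is not reproduced here; the following is only a minimal PyTorch sketch of that general CNN-plus-transformer pattern. All module names, channel counts, and the residual layout are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a hybrid CNN + self-attention denoising network.
# NOT the authors' Hformer implementation; layer names, channel counts,
# and the residual layout are illustrative assumptions only.
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Local conv features followed by global multi-head self-attention."""
    def __init__(self, channels: int = 64, heads: int = 4):
        super().__init__()
        # Convolutions capture local structure (edges, textures).
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        # Self-attention over spatial positions captures global context.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.local(x)               # residual local (CNN) branch
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        attn_out, _ = self.attn(self.norm(seq), self.norm(seq), self.norm(seq))
        # Residual global (attention) branch, reshaped back to an image.
        return x + attn_out.transpose(1, 2).reshape(b, c, h, w)

class TinyDenoiser(nn.Module):
    """Residual denoiser: predicts the noise and subtracts it from the input."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.head = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        self.body = HybridBlock(channels)
        self.tail = nn.Conv2d(channels, 1, kernel_size=3, padding=1)

    def forward(self, ldct: torch.Tensor) -> torch.Tensor:
        noise = self.tail(self.body(self.head(ldct)))
        return ldct - noise  # global residual: output = input - predicted noise

if __name__ == "__main__":
    model = TinyDenoiser()
    patch = torch.randn(1, 1, 64, 64)  # one single-channel LDCT patch
    print(model(patch).shape)          # torch.Size([1, 1, 64, 64])
```

In such hybrids, the global residual connection lets the network learn the (typically sparse) noise component rather than the full image, which is a common choice in CT denoising models.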

Original language: English
Article number: 61
Journal: Nuclear Science and Techniques
Volume: 34
Issue number: 4
State: Published - April 2023

Keywords

  • Auto-encoder
  • Convolutional neural networks
  • Deep learning
  • Image denoising
  • Low-dose CT
  • Medical image
  • Residual network
  • Self-attention

