TY - JOUR
T1 - View-Semantic Transformer with Enhancing Diversity for Sparse-View SAR Target Recognition
AU - Liu, Zhunga
AU - Wu, Feiyan
AU - Wen, Zaidao
AU - Zhang, Zuowei
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - With the rapid development of supervised learning-based synthetic aperture radar (SAR) target recognition, recognition performance is observed to scale with the number of training samples. However, incomplete data within categories biases the data distribution and leaves the model under-representative, which exacerbates the challenge of SAR interpretation. In this article, we propose a new view-semantic transformer network (VSTNet) that generates synthesized samples to complete the statistical distribution of the training data and improve the discriminative representation of the model. First, SAR images from different views are encoded into a disentangled latent space, which allows data with more diverse views to be synthesized by manipulating the view-semantic features. Second, the synthesized data complement and effectively expand the training set, alleviating the overfitting caused by limited data in sparse views. Third, the proposed method unifies SAR image synthesis and SAR target recognition in an end-to-end framework so that the two tasks mutually boost each other's performance. Experiments conducted on the moving and stationary target acquisition and recognition (MSTAR) data demonstrate the robustness and effectiveness of the proposed method.
AB - With the rapid development of supervised learning-based synthetic aperture radar (SAR) target recognition, recognition performance is observed to scale with the number of training samples. However, incomplete data within categories biases the data distribution and leaves the model under-representative, which exacerbates the challenge of SAR interpretation. In this article, we propose a new view-semantic transformer network (VSTNet) that generates synthesized samples to complete the statistical distribution of the training data and improve the discriminative representation of the model. First, SAR images from different views are encoded into a disentangled latent space, which allows data with more diverse views to be synthesized by manipulating the view-semantic features. Second, the synthesized data complement and effectively expand the training set, alleviating the overfitting caused by limited data in sparse views. Third, the proposed method unifies SAR image synthesis and SAR target recognition in an end-to-end framework so that the two tasks mutually boost each other's performance. Experiments conducted on the moving and stationary target acquisition and recognition (MSTAR) data demonstrate the robustness and effectiveness of the proposed method.
KW - Incomplete data
KW - sparse views
KW - synthetic aperture radar (SAR) target recognition
KW - view-semantic transformer
UR - http://www.scopus.com/inward/record.url?scp=85164751359&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2023.3293478
DO - 10.1109/TGRS.2023.3293478
M3 - Article
AN - SCOPUS:85164751359
SN - 0196-2892
VL - 61
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5211610
ER -