特征提取策略对高分辨率遥感图像场景分类性能影响的评估

Translated title of the contribution: Evaluation of the effect of feature extraction strategy on the performance of high-resolution remote sensing image scene classification

Xiaoliang Qian, Jia Li, Gong Cheng, Xiwen Yao, Suna Zhao, Yibin Chen, Liying Jiang

Research output: Contribution to journal › Article › peer-review

25 Scopus citations

Abstract

Remote sensing image scene classification aims to tag remote sensing images with semantic categories according to their content and is important in disaster monitoring, environmental detection, and urban planning. Scene classification results can provide valuable information for object recognition and image retrieval and can effectively improve the performance of image interpretation. The general process of remote sensing image scene classification consists mainly of feature extraction and scene classification based on image features. Given that the design of classifiers is relatively mature, this work focuses on feature extraction strategies. However, the influence of the various strategies on scene classification performance lacks a unified evaluation, which limits their development. This study therefore evaluates the effect of various feature extraction strategies on the performance of high-resolution remote sensing image scene classification.

In the second section of this paper, existing feature extraction strategies are divided into two categories: (1) hand-designed and (2) data-driven feature extraction. Hand-designed features, such as Color Histograms (CH) and the Scale Invariant Feature Transform (SIFT), provide a primary description of images and were proposed in the early period of the field. A more abstract description of the images is obtained by encoding hand-designed features, for example with the Bag of Visual Words (BoVW) model, which achieves higher classification accuracy than the hand-designed features themselves. However, these strategies generally suffer from poor generalization capability because they are designed for specific requirements, and hand-designed features demand significant domain knowledge. By contrast, data-driven features can be learned directly from a large number of sample images and are generally divided into shallow and deep learning features. Shallow learning feature extraction mainly involves Principal Component Analysis (PCA), Independent Component Analysis (ICA), and sparse coding algorithms. Typical deep learning feature extraction strategies include the Stacked Autoencoder (SAE), the Deep Belief Network (DBN), and the Convolutional Neural Network (CNN). Compared with deep learning models, shallow learning models can be regarded as neural networks with a single hidden layer and thus cannot capture high-level semantic features; the superiority of deep learning features is obvious when dealing with complex scene classification. Furthermore, CNN-based features outperform SAE- and DBN-based features because the one-dimensional structure of SAE and DBN destroys the spatial information of images.

In the third section of this paper, 29 feature descriptors are quantitatively compared on the UC Merced, AID, and NWPU-RESISC45 datasets, and eight combinations of feature descriptors are quantitatively compared on the NWPU-RESISC45 dataset. The effect of different feature extraction strategies on scene classification performance and the complexity of each dataset are evaluated through this quantitative comparison. The experimental results are as follows. (1) The classification accuracy and stability of hand-designed features are poor; however, most of them are efficient and can attain better performance when combined with other types of features. (2) Among all feature extraction strategies, the encoding of hand-designed features achieves moderate classification accuracy, efficiency, and stability. (3) The classification accuracy and stability of data-driven features are the best, but most of them have low efficiency. (4) AlexNet, a deep learning model with few layers, exhibits the best overall performance and is suitable for applications that require high classification accuracy, efficiency, and stability. (5) Some scene classes belonging to the land-use type are easily confused because of similar landmark buildings or sites, and some scene classes belonging to the land-cover type are easily confused because of similar geomorphologic features. (6) The recently proposed NWPU-RESISC45 dataset is more complex than the other datasets and is more challenging for scene classification algorithms.

Finally, the summary and conclusions of this paper are presented, and future developments are discussed. On the one hand, combining the prior knowledge introduced by hand-designed features with the CNN model may be one of the future development directions; on the other hand, introducing Generative Adversarial Networks (GAN) into CNN training may become a research hotspot. In addition, remote sensing parameters such as NDVI and NDWI, as well as multi-spectral information, can be integrated with current feature extraction strategies for practical applications.
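The "feature extraction followed by a classifier" pipeline surveyed above can be made concrete with a short sketch. The Python code below is not from the paper: the file paths, the 16-bin histogram, the linear SVM, and the use of torchvision's ImageNet-pretrained AlexNet are all illustrative assumptions. It contrasts a hand-designed feature (a color histogram) with a data-driven feature (AlexNet penultimate-layer activations), each fed to the same linear classifier.

```python
# Minimal sketch (illustrative only): hand-designed vs. data-driven features
# for scene classification with a linear SVM. Assumes a recent torchvision
# (>= 0.13) with ImageNet-pretrained AlexNet weights available.
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms
from sklearn.svm import LinearSVC

def color_histogram(img, bins=16):
    """Hand-designed feature: per-channel RGB histogram, L1-normalised (48-D)."""
    arr = np.asarray(img.convert("RGB"))
    feats = [np.histogram(arr[..., c], bins=bins, range=(0, 255))[0] for c in range(3)]
    h = np.concatenate(feats).astype(np.float32)
    return h / (h.sum() + 1e-8)

# Data-driven feature: 4096-D activations from AlexNet's penultimate layer.
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def alexnet_feature(img):
    x = preprocess(img.convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        x = alexnet.features(x)
        x = alexnet.avgpool(x)
        x = torch.flatten(x, 1)
        x = alexnet.classifier[:-1](x)  # stop before the 1000-way ImageNet layer
    return x.squeeze(0).numpy()

def classify(train_paths, train_labels, test_paths, extractor):
    """Generic two-stage pipeline: extract features, then train a linear SVM."""
    X_train = np.stack([extractor(Image.open(p)) for p in train_paths])
    X_test = np.stack([extractor(Image.open(p)) for p in test_paths])
    clf = LinearSVC().fit(X_train, train_labels)
    return clf.predict(X_test)

# Usage (hypothetical file lists): swap the extractor to compare strategies.
# preds_hand = classify(train_paths, train_labels, test_paths, color_histogram)
# preds_deep = classify(train_paths, train_labels, test_paths, alexnet_feature)
```

Swapping only the extractor while keeping the classifier fixed mirrors the evaluation setup described in the abstract, where classifier design is held relatively constant and the feature extraction strategy is the variable under study.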

Original language: Chinese (Simplified)
Pages (from-to): 758-776
Number of pages: 19
Journal: Yaogan Xuebao/Journal of Remote Sensing
Volume: 22
Issue number: 5
State: Published - Sep 2018
