2021, Vol. 25, Issue (11): 2270-2282


DOI: 10.11834/jrs.20210587

Received: 2021-01-07

Scene classification of high-resolution remote sensing images based on a CNN-GCN dual-stream network
Deng Peifang 1, Xu Kejie 1, Huang Hong 1,2
1. Key Laboratory of Optoelectronic Technology and Systems of the Ministry of Education, Chongqing University, Chongqing 400044, China; 2. State Key Laboratory of Coal Mine Disaster Dynamics and Control, Chongqing University, Chongqing 400044, China
Abstract:

High-resolution remote sensing images have complex geometric structures and spatial layouts. Traditional convolutional neural network methods extract only the global features of a scene image and ignore contextual relationships, which limits the representational power of the features and constrains classification accuracy. To address this problem, this paper proposes a CNN-GCN dual-stream network for scene classification of high-resolution remote sensing images, consisting of two modules: a CNN stream and a GCN stream. The CNN stream extracts the global features of high-resolution images with a pretrained DenseNet-121 network, while the GCN stream builds an adjacency graph from the convolutional feature maps of a pretrained VGGNet-16 network and then extracts contextual features through a GCN model. Finally, the global and contextual features are fused effectively by weighted concatenation, and a linear classifier performs the classification. Experiments on three challenging data sets, AID, RSSCN7, and NWPU-RESISC45, yield best classification accuracies of 97.14%, 95.46%, and 94.12%, respectively, showing that the proposed algorithm represents scenes effectively and achieves competitive classification results.

CNN-GCN-based dual-stream network for scene classification of remote sensing images
Abstract:

Scene classification is an important research topic that aims at assigning a semantic label to a given image. High-Spatial-Resolution (HSR) images contain abundant information about ground objects, such as geometric structure and spatial layout, and their complexity makes them difficult to interpret effectively. Extracting discriminative features is the key step in improving classification accuracy. Various methods for constructing discriminative representations have been proposed, including handcrafted feature-based methods and deep learning-based methods. The former focus on designing handcrafted features using professional knowledge and describe a scene through a single feature or multifeature fusion. However, for complex scenes, handcrafted features show limited discriminative and generalization capabilities. Deep learning-based methods, owing to their powerful feature-extraction capability, have made remarkable progress in the field of scene classification. Compared with handcrafted approaches, Convolutional Neural Networks (CNNs) can automatically extract deep features from massive HSR images. Nevertheless, CNNs focus merely on global information, which prevents them from exploring the contextual relationships in HSR images. Recently, Graph Convolutional Networks (GCNs) have become an important branch of deep learning and have been adopted to model the spatial relations hidden in HSR images via graph structures. In this paper, a novel architecture termed the CNN-GCN-based Dual-Stream Network (CGDSN) is proposed for scene classification. The CGDSN method contains two modules: a CNN stream and a GCN stream. For the CNN stream, the pretrained DenseNet-121 is employed as the backbone to extract the global features of HSR images. In the GCN stream, VGGNet-16, pretrained on ImageNet, is used to generate the feature maps of the last convolutional layer. Then, an average pooling operation is applied for downsampling before the construction of an adjacency matrix.
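As a rough illustration of this graph-construction step, the NumPy sketch below turns a pooled convolutional feature map into a symmetrically normalized adjacency matrix. The cosine-similarity measure and the 0.5 threshold are illustrative assumptions; the abstract does not specify the paper's exact edge criterion.

```python
import numpy as np

def build_adjacency(feature_map, threshold=0.5):
    """Build a normalized adjacency matrix from a conv feature map.

    feature_map: (C, H, W) array; each spatial position becomes a graph node.
    Edges connect positions whose feature vectors have cosine similarity
    above `threshold` (an assumed criterion, not the paper's exact one).
    """
    C, H, W = feature_map.shape
    nodes = feature_map.reshape(C, H * W).T            # (N, C), N = H*W nodes
    norms = np.linalg.norm(nodes, axis=1, keepdims=True) + 1e-8
    sim = (nodes / norms) @ (nodes / norms).T          # cosine similarity
    A = (sim > threshold).astype(np.float64)           # threshold into edges
    np.fill_diagonal(A, 0.0)                           # drop trivial self-edges
    A_hat = A + np.eye(H * W)                          # add explicit self-loops
    d = A_hat.sum(axis=1)                              # node degrees (>= 1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt             # symmetric normalization

fm = np.random.rand(8, 4, 4)      # toy 8-channel 4x4 pooled feature map
A_norm = build_adjacency(fm)
print(A_norm.shape)               # (16, 16)
```

The symmetric normalization D^(-1/2)(A+I)D^(-1/2) is the standard preprocessing for graph convolution layers and keeps node aggregation numerically stable.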
Given that every image is represented by a graph, a GCN model is developed to capture context relationships. The two graph convolutional layers of the GCN stream are followed by a global average pooling layer and a Fully Connected (FC) layer to form the context features of HSR images. Lastly, to fuse the global and context features adequately, a weighted concatenation layer integrates them, and an FC layer predicts the scene categories. The AID, RSSCN7, and NWPU-RESISC45 data sets are chosen to verify the effectiveness of the CGDSN method. Experimental results show that the proposed CGDSN algorithm outperforms several state-of-the-art methods in terms of Overall Accuracy (OA). On the AID data set, the OAs reach 95.62% and 97.14% under training ratios of 20% and 50%, respectively. On the RSSCN7 data set, the classification result obtained by the CGDSN method is 95.46% with 50% training samples. On the NWPU-RESISC45 data set, the classification accuracies achieved by the CGDSN method are 91.86% and 94.12% under training ratios of 10% and 20%, respectively. The proposed CGDSN method can extract discriminative features and achieves competitive accuracies for scene classification.
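The GCN-stream pipeline described above (two graph-convolutional layers, global average pooling, an FC projection, then weighted concatenation with the CNN-stream feature) can be sketched as follows. The layer widths, fusion weight `alpha`, and random stand-in features are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

def gcn_layer(A_norm, X, W):
    # One graph convolution: neighborhood aggregation, linear transform, ReLU.
    return np.maximum(A_norm @ X @ W, 0.0)

rng = np.random.default_rng(0)
N = 16                                  # graph nodes (pooled spatial positions)
A_norm = np.eye(N)                      # stand-in for a normalized adjacency matrix
X = rng.random((N, 8))                  # node features from the conv feature map
W1, W2 = rng.random((8, 8)), rng.random((8, 4))

H = gcn_layer(A_norm, gcn_layer(A_norm, X, W1), W2)   # two GCN layers
pooled = H.mean(axis=0)                               # global average pooling
context = pooled @ rng.random((4, 128))               # FC layer -> context feature

global_feat = rng.random(128)           # stand-in for the DenseNet-121 stream output
alpha = 0.5                             # assumed fusion weight
fused = np.concatenate([alpha * global_feat, (1 - alpha) * context])
print(fused.shape)                      # (256,)
```

A final FC layer over `fused` (omitted here) would map the concatenated vector to the scene-category logits.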
