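Remote sensing image time series provide an important data basis for land cover classification, and extracting temporal classification features with deep learning has long been a research focus. However, deep learning models based on recurrent and convolutional networks often struggle to achieve high classification accuracy on small-sample land cover classes when the training samples are imbalanced. To address this problem, this paper applies the self-attention mechanism, a recent method from natural language processing, to the classification of multispectral remote sensing time series. We make two modifications to the Transformer encoder: (1) a feature-expansion layer is added before the multi-head attention to enrich the spectral information of the data; (2) flattening followed by dimensionality reduction replaces Global Maximum Pooling (GMP) as the feature-dimension reduction strategy. With these changes we build a feature extraction network based on temporal self-attention, compare it with recurrent and convolutional networks, and use a public multispectral remote sensing time series dataset to evaluate how effectively the method improves accuracy on small-sample classes. The experimental results show that the feature extraction network built on temporal self-attention can be applied effectively to multispectral remote sensing time series classification and helps improve the classification accuracy of small-sample land cover classes.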
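The two encoder modifications described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a single attention head in plain Python, omits positional encoding, residual connections, layer normalization and the feed-forward sublayer, and all sizes (T, BANDS, D_MODEL, D_OUT), weight names and random values are hypothetical choices made only for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over the time axis."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]          # transpose of k
    scores = matmul(q, k_t)                       # (T, T): every step attends to every step
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

def extract_features(series, params):
    """series: T time steps, each a list of BANDS spectral values."""
    # Modification (1): expand the spectral dimension before attention.
    h = matmul(series, params["w_up"])            # (T, D_MODEL)
    attended, weights = self_attention(h, params["wq"], params["wk"], params["wv"])
    # Baseline reduction: global maximum pooling over time.
    gmp = [max(col) for col in zip(*attended)]    # (D_MODEL,)
    # Modification (2): flatten, then reduce with a linear projection.
    flat = [v for row in attended for v in row]   # (T * D_MODEL,)
    reduced = matmul([flat], params["w_red"])[0]  # (D_OUT,)
    return gmp, reduced, weights

# Hypothetical toy sizes, not taken from the paper.
T, BANDS, D_MODEL, D_OUT = 5, 4, 8, 6
rng = random.Random(0)
params = {
    "w_up": random_matrix(BANDS, D_MODEL, rng),
    "wq": random_matrix(D_MODEL, D_MODEL, rng),
    "wk": random_matrix(D_MODEL, D_MODEL, rng),
    "wv": random_matrix(D_MODEL, D_MODEL, rng),
    "w_red": random_matrix(T * D_MODEL, D_OUT, rng),
}
series = random_matrix(T, BANDS, rng)
gmp, reduced, weights = extract_features(series, params)
print(len(gmp), len(reduced))
```

The sketch returns both reductions side by side so the difference is visible: GMP keeps one maximum per feature and discards where in time it occurred, whereas the flattened projection sees every time step's features at once.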
(Objective) With the rapid development of remote sensing technology, continuously accumulating remote sensing time series data provide important support for the study of land cover classification. Extracting discriminative classification features from remote sensing time series with deep learning methods has long been a research focus. Deep learning methods require a large amount of training data, but sample imbalance makes it difficult for the commonly used recurrent and convolutional networks to achieve high accuracy in categories with few samples. To address this problem, this paper introduces the self-attention mechanism, which originated in the field of natural language processing, to the classification of multispectral remote sensing time series data, with the aim of extracting deep temporal features at a global scale. In contrast, recurrent networks extract temporal features by using information from previous time steps along the temporal dimension, and convolutional networks extract temporal features within a local temporal neighborhood. (Method) We construct a new feature extraction network based on the Transformer encoder, which first employed the self-attention mechanism in natural language processing, and compare it with a feature extraction network based on Long Short-Term Memory (LSTM) and one based on the Temporal Convolutional Neural Network (TempCNN), in order to evaluate how effectively the self-attention-based method improves the classification accuracy of small-sample categories. To achieve a fair comparison, we adopt a generic classification framework consisting of data input, a feature extraction network, a classifier, and classification output, and we use different models with different hyperparameters as the feature extraction network.
We then evaluate the classification performance of the different methods on the public TiSeLaC multispectral remote sensing time series dataset, using per-class accuracy, overall accuracy (OA), and mean intersection over union (mIoU) as metrics. (Result) To obtain a sound measure of the different methods, we select the three hyperparameter settings with the best mIoU for each model and report the average of their metrics as the final result. The results show that the self-attention-based network performs better than the recurrent and convolutional networks. The self-attention-based method achieves 92.98% OA and 80.60% mIoU, which is 1.25% and 1.32% higher than the recurrent network and the convolutional network, respectively. In terms of per-class accuracy, the self-attention-based network achieves comparable accuracy in majority-sample categories, with differences of less than 0.74% relative to the recurrent and convolutional networks, while significantly improving classification accuracy in small-sample categories, by margins of 2.47% to 5.41%. (Conclusion) This paper introduces the self-attention mechanism to the classification of multispectral remote sensing time series data to cope with the low classification accuracy in small-sample categories caused by sample imbalance. We construct a new temporal feature extraction network based on the self-attention mechanism to extract temporal features globally from time series, and design a set of objective comparison experiments. Based on the experimental results, we conclude that the global way in which the self-attention mechanism extracts temporal features from time series, compared with the recurrent network's use of previous time information and the convolutional network's focus on a local temporal neighborhood, achieves comparable accuracy in majority-sample categories while effectively improving accuracy in small-sample categories.
We believe that self-attention-based methods could play an important role in the classification of remote sensing time series in the future, and that further research is needed.
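The evaluation protocol named in the abstract (per-class accuracy, OA, mIoU, and averaging over the three best-mIoU hyperparameter settings) can be sketched as follows. This is an illustration under assumptions, not the authors' code: per-class accuracy is taken here to mean the fraction of each reference class that is predicted correctly, the function names are hypothetical, and the labels and run scores are made-up toy values rather than results from the paper.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts samples of reference class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def overall_accuracy(cm):
    """Correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def per_class_accuracy(cm):
    """Assumed definition: correct samples of a class / reference samples of that class."""
    return [cm[i][i] / sum(cm[i]) if sum(cm[i]) else float("nan")
            for i in range(len(cm))]

def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN), skipping classes that never occur."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        row_sum = sum(cm[c])                 # TP + FN
        col_sum = sum(row[c] for row in cm)  # TP + FP
        union = row_sum + col_sum - tp
        if union:
            ious.append(tp / union)
    return sum(ious) / len(ious)

def average_top_k(runs, k=3, key="miou"):
    """Average every metric over the k runs with the highest value of `key`."""
    best = sorted(runs, key=lambda r: r[key], reverse=True)[:k]
    return {name: sum(r[name] for r in best) / len(best) for name in best[0]}

# Toy labels with an imbalanced class distribution (hypothetical values).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
oa = overall_accuracy(cm)
pca = per_class_accuracy(cm)
miou = mean_iou(cm)

# Hypothetical scores for four hyperparameter settings of one model.
runs = [
    {"oa": 0.90, "miou": 0.70},
    {"oa": 0.92, "miou": 0.80},
    {"oa": 0.91, "miou": 0.75},
    {"oa": 0.80, "miou": 0.60},
]
final = average_top_k(runs, k=3)
print(oa, pca, miou, final)
```

In the toy example the single sample of class 2 is misclassified, which barely moves OA but pulls mIoU down sharply. This shows why mIoU and per-class accuracy are more sensitive than OA to errors on small-sample categories.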