
DOI: 10.11834/jrs.20210216

Received: 2020-06-18

Revised: 2021-01-16

Dynamic ensemble algorithm of SMOTE and rotation forest for imbalanced hyperspectral remote sensing classification
童莹萍1, 冯伟1, 全英汇1, 黄文江2, 高连如2, 朱文涛1, 邢孟道1
1. Xidian University; 2. Aerospace Information Research Institute, Chinese Academy of Sciences
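Abstract (translated from the Chinese):

Rotation Forest (RoF) is a powerful ensemble classifier that has been applied successfully to hyperspectral image classification. Real-world data, however, are often class-imbalanced, which leads the traditional RoF algorithm to favor majority-class samples at the expense of classification accuracy on minority-class samples. The SMOTE algorithm increases the number of minority-class samples by synthesizing new ones, thereby balancing the class distribution of the data set. At present, however, SMOTE is used mainly as a data preprocessing step, and it risks introducing artificial noise in multi-class problems. To address multi-class imbalance in hyperspectral data learning, this paper proposes a new dynamic ensemble algorithm combining SMOTE and RoF. The algorithm uses a dynamic sampling factor to integrate class-distribution optimization with the training of the base classifiers. Three public hyperspectral data sets are used to test the algorithm, and four algorithms are selected for comparison: random forest, traditional RoF, and RoF preceded by random oversampling or by SMOTE preprocessing. The evaluation criteria are overall accuracy, average accuracy, F-measure, Gmean, minimum recall, ensemble classifier diversity, model training time, and the McNemar test. The experimental results show that the proposed method has a clear classification advantage: it improves the recognition accuracy of minority-class samples while also increasing the overall classification accuracy.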
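The SMOTE step mentioned in the abstract can be illustrated with a minimal sketch. It shows only the standard SMOTE idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours), not the paper's dynamic ensemble. The function name `smote_oversample` and its parameters `n_new`, `k`, and `seed` are hypothetical choices made for this illustration, and the abstract does not say how the dynamic sampling factor sets the number of synthetic samples, so that count is left to the caller here.

```python
import random


def smote_oversample(minority, n_new, k=5, seed=None):
    """Return n_new synthetic samples interpolated between minority-class points.

    minority: list of feature vectors (lists of floats) from one minority class.
    n_new, k, seed: illustrative parameters, not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(minority)
    if n < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, n - 1)

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # For every minority sample, the indices of its k nearest minority neighbours.
    neighbours = []
    for i, a in enumerate(minority):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(a, minority[j]),
        )
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(n)          # pick a base sample at random
        j = rng.choice(neighbours[i])  # pick one of its nearest neighbours
        gap = rng.random()             # interpolation weight in [0, 1)
        synthetic.append(
            [x + gap * (y - x) for x, y in zip(minority[i], minority[j])]
        )
    return synthetic


# Usage: six synthetic points from four minority samples on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_oversample(minority, n_new=6, k=2, seed=0)
```

Every synthetic point lies on a line segment between two existing minority samples. Interpolation cannot tell whether that segment crosses another class's region, which is one reason it can add noise in multi-class problems.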

Dynamic ensemble algorithm of SMOTE and rotation forest for imbalanced hyperspectral remote sensing classification
Abstract:

Objective: Rotation Forest (RoF), a powerful ensemble classifier, has been applied successfully in hyperspectral image classification. However, real-world data are often class-imbalanced, which causes the traditional RoF algorithm to focus on identifying majority-class samples while neglecting accuracy on minority-class samples. The SMOTE algorithm increases the number of minority-class samples by synthesizing new samples, thereby balancing the class distribution of the data set. However, SMOTE is mainly used in the data preprocessing stage, and it risks adding artificial noise in multi-class problems. Therefore, to increase the classification accuracy on multi-class imbalanced hyperspectral data, this paper proposes a novel dynamic ensemble algorithm based on SMOTE and RoF. Method: The proposed algorithm uses a dynamic sampling factor to merge class-distribution optimization with base classifier training. It not only generates class-balanced data sets adaptively but also greatly reduces the influence of noise on the base classifiers. Result: Three public hyperspectral images are used to test the algorithm, and four comparison algorithms are selected: random forest, traditional RoF, and RoF with random oversampling or SMOTE as data preprocessing. The evaluation criteria are overall accuracy, average accuracy, F-measure, Gmean, minimum recall, ensemble classifier diversity, model training time, and the McNemar test. Conclusion: The experimental results demonstrate the effectiveness of the proposed method, which has a clear classification advantage and increases the recognition accuracy of minority-class samples while maintaining the overall classification accuracy.
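Several of the evaluation criteria listed in the abstract can be computed directly from true and predicted labels. The sketch below uses common definitions, which are assumptions since the abstract does not spell them out: average accuracy as the mean of per-class recalls, F-measure as the macro average of per-class F1 scores, and Gmean as the geometric mean of per-class recalls. The function name `imbalance_metrics` is hypothetical. Ensemble diversity, training time, and the McNemar test are not covered.

```python
import math


def imbalance_metrics(y_true, y_pred):
    """Compute accuracy-style metrics that are informative under class imbalance."""
    classes = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    recalls, f_scores = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # correctly labelled as c
        actual = sum(t == c for t in y_true)           # samples truly in class c
        predicted = sum(p == c for p in y_pred)        # samples labelled as c
        recall = tp / actual
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return {
        "overall_accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "average_accuracy": sum(recalls) / len(recalls),
        "f_measure": sum(f_scores) / len(f_scores),
        "g_mean": math.prod(recalls) ** (1.0 / len(recalls)),
        "min_recall": min(recalls),
    }


# Usage: class 0 is the majority class; classes 1 and 2 are minority classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
metrics = imbalance_metrics(y_true, y_pred)
```

In this toy case the overall accuracy is 0.75 while the minimum recall is only 0.5. Overall accuracy alone can hide a weak minority class in this way, which is why the per-class measures are reported alongside it.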
