DOI: 10.11834/jrs.20231638

Received: 2021-10-05

Revised: 2022-04-23

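Application of an improved CenterNet to object detection in remote sensing images
Tian Zhuangzhuang1, Zhang Hengwei1, Wang Kun1, Liu Shengqi2, Zou Qianjin1, Zhao Zhen1, Chen Yubin1
1. State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System; 2. College of Electronic Science, National University of Defense Technology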
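Abstract:

To improve the efficiency and accuracy of object detection in remote sensing images, this paper proposes a detection method based on an improved CenterNet. Built on the CenterNet detection framework, the method reduces the number of steps that detection requires and lessens the dependence on anchor boxes. On top of CenterNet, the method first adopts a ResNet with transposed convolutions as the backbone, which reduces the number of backbone parameters. It then proposes, for the heat map labels used in training, a way to compute the side length of the region over which the Gaussian kernel designed for center points is applied. Finally, it uses an attention mechanism to make the object-region features among the extracted features more effective. Experiments on public high-resolution remote sensing images show that the three improvements raise detection accuracy by 4.0% while reducing the detection time to 31.9% of the original. Compared with the other methods tested, the proposed method has advantages in both accuracy and speed, which indicates that it is practical for object detection in remote sensing images.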
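The heat map labels mentioned above can be illustrated with a minimal sketch. It only shows how a Gaussian peak is written around one center point; the `radius` argument is a hypothetical input and does not reproduce the paper's own side-length calculation, and the choice `sigma = (2 * radius + 1) / 6` is an assumption borrowed from commonly used public CenterNet code, not something stated in this abstract.

```python
import math


def draw_gaussian(heatmap, cx, cy, radius):
    """Write a 2D Gaussian peak centred at (cx, cy) into heatmap, in place.

    heatmap is a list of rows (one object category); radius is a
    hypothetical kernel half-width supplied by the caller.
    """
    sigma = (2 * radius + 1) / 6.0  # assumed convention, not from the paper
    height, width = len(heatmap), len(heatmap[0])
    # Only cells inside the square of side 2 * radius + 1 are touched.
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            dist_sq = (x - cx) ** 2 + (y - cy) ** 2
            value = math.exp(-dist_sq / (2 * sigma ** 2))
            # Keep the larger value where kernels of nearby objects overlap.
            heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap


heatmap = [[0.0] * 8 for _ in range(8)]
draw_gaussian(heatmap, cx=3, cy=4, radius=2)
```

A larger `radius` spreads positive supervision over more cells around the center, which is why the way this length is chosen affects training.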

Application of an improved CenterNet in remote sensing image object detection
Abstract:

Object detection methods based on deep learning are now widely used in the interpretation of remote sensing images. Anchor-based methods usually require anchor boxes to be designed first, which adds detection steps and time cost. This paper proposes a remote sensing image object detection method based on an improved CenterNet. The method simplifies the object detection process and improves efficiency. CenterNet uses a fully convolutional network to directly predict the heat map of the center points, the widths and heights of the corresponding objects, and the position offsets of the center points. The heat maps give the rough positions of the objects, and the offsets fine-tune these positions to make them more accurate. The widths and heights determine the shape of the object boxes. Each object category has its own heat map, so the heat map in which a center point appears determines the category. On the basis of CenterNet, the proposed method first adopts a ResNet with transposed convolution as the backbone network. The transposed convolution enlarges the output feature maps, and ResNet reduces the number of parameters in the backbone network compared with the Hourglass network. Secondly, the proposed method defines the length of the Gaussian kernel under three limiting conditions between the predicted and ground-truth boxes in CenterNet. The Gaussian kernel is applied to generate the heat map label used for network training. Finally, a multi-head attention mechanism is introduced into the backbone network to learn the importance of each element in the feature maps. The weights of the elements express their effectiveness, which concentrates the effective features in the regions of the object key points as much as possible. The experiments use mean average precision (mAP) to evaluate the object detection results across multiple categories. All the experiments are conducted on the DIOR dataset.
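The decoding step described above (heat map peak, then offset, then width and height) can be sketched as follows. This is a simplified illustration for a single category, not the authors' implementation: the function name `decode_centers`, the `threshold` value, the output stride of 4, and the convention that sizes are expressed in heat map cells are all assumptions made for the example.

```python
def decode_centers(heatmap, offset, size, stride=4, threshold=0.3):
    """Turn heat map peaks into (x1, y1, x2, y2, score) boxes.

    heatmap[y][x] is a center-point score for one category,
    offset[y][x] is a sub-cell shift (dx, dy), and size[y][x] is
    (width, height) in heat map cells. All names are hypothetical.
    """
    height, width = len(heatmap), len(heatmap[0])
    boxes = []
    for y in range(height):
        for x in range(width):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # Keep a cell only if no cell in its 3x3 neighbourhood is larger.
            neighbours = [
                heatmap[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            if score < max(neighbours):
                continue
            dx, dy = offset[y][x]
            w, h = size[y][x]
            # Refine the coarse cell position, then map back to image pixels.
            cx, cy = (x + dx) * stride, (y + dy) * stride
            w, h = w * stride, h * stride
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score))
    return boxes


# Toy 5x5 prediction with one object centred near cell (x=3, y=2).
heatmap = [[0.0] * 5 for _ in range(5)]
offset = [[(0.0, 0.0)] * 5 for _ in range(5)]
size = [[(0.0, 0.0)] * 5 for _ in range(5)]
heatmap[2][3] = 0.9
heatmap[2][2] = 0.4  # weaker neighbour, suppressed by the peak next to it
offset[2][3] = (0.25, 0.5)
size[2][3] = (2.0, 1.0)

boxes = decode_centers(heatmap, offset, size)
print(boxes)
```

Because every box comes directly from a peak and its regressed values, no anchor boxes have to be designed or matched, which is the simplification the abstract refers to.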
The results show that CenterNet using the ResNet with transposed convolution achieves an mAP 1.4% higher than CenterNet using the Hourglass network. The proposed calculation of the length of the Gaussian kernel increases mAP by 1.1%. The addition of the attention mechanism further improves mAP by 1.5%. At the same time, the time cost of the proposed method is reduced to 31.9% of that of the conventional method. The experimental results show that the proposed method improves detection accuracy without sacrificing detection speed. The ablation experiments on the different parts also show that the ResNet with transposed convolution, the designed calculation method for the length of the Gaussian kernel, and the attention mechanism each effectively improve mAP. The comparison with other methods also shows that the proposed method is practical.
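As a quick consistency check on these figures, assuming the three reported gains are additive mAP increments and that 31.9% is the ratio of detection times (the abstract does not say how either quantity was computed):

```latex
\Delta\mathrm{mAP} = 1.4\% + 1.1\% + 1.5\% = 4.0\%,
\qquad
\text{speed-up} = \frac{1}{0.319} \approx 3.13
```

Under those assumptions, the combined gain equals the 4.0% total stated in the abstract above, and the reported time ratio corresponds to detection that is roughly three times faster.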
