A Review of Cross-View Image Geo-localization Datasets

张硝; 高艺; 夏宇翔; 赵春雪

DOI: 10.11834/jrs.20254348

Received: 2024-08-11

Revised: 2025-01-10


Unit 61175

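Abstract:

Cross-view image geo-localization matches images taken from different viewpoints, retrieves the most similar image from a reference gallery, and uses that image's GPS tag to localize the query. Traditional single-view image geo-localization is limited by factors such as dataset quality, scale, and positioning accuracy, so in recent years many researchers and institutions have released cross-view geo-localization datasets, laying a data foundation for improving geo-localization accuracy. However, a systematic analysis of these datasets is still lacking. This paper first surveys 32 classic datasets released since cross-view image geo-localization emerged, builds a classification scheme along four dimensions (viewpoint information, construction type, degree of realism, and temporal information), and summarizes the basic information of each dataset. It then analyzes the datasets in depth from five aspects (metadata, influence, keywords, acquisition sources, and application fields) and summarizes the current mainstream algorithms for cross-view image geo-localization. Finally, it discusses future directions for cross-view localization datasets from four angles: the trend toward multimodal datasets, large-model methods, the handling of image distractors, and model optimization. The review is intended as a reference for researchers in related fields.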

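Keywords:

cross-view, image geo-localization, datasets, deep learning, unmanned aerial vehicle, image retrieval, image matching, computer vision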

A review of cross-view image geo-localization datasets

Abstract:

Cross-view image geo-localization aims to retrieve the most similar image from a reference database by matching images captured from different viewpoints, and then uses the retrieved image's GPS tag to localize the query. Traditional single-view image geo-localization is limited by factors such as dataset quality, scale, and positioning accuracy. In recent years, numerous researchers and institutions have therefore released cross-view geo-localization datasets, laying a data foundation for improving geo-localization accuracy. Nevertheless, a systematic analysis of these datasets is still lacking. Objective: This paper aims to provide a comprehensive review of published cross-view image geo-localization datasets. Method: Based on a literature review, we collect and organize 32 classic cross-view image geo-localization datasets published from 2011 to 2024, and construct a classification system along four dimensions: viewpoint information, construction type, authenticity, and temporal information. We summarize the basic information of these datasets in tabular form, including each dataset's name, image resolution, data scale, and covered scenes, so that the fundamental attributes of cross-view geo-localization datasets are described from multiple perspectives. We then analyze the datasets from five aspects: metadata, influence, keywords, acquisition sources, and application fields. Additionally, we collate and summarize the mainstream algorithms for cross-view image geo-localization (e.g., network structure optimization, loss function optimization, and attention mechanisms). Finally, we discuss future development directions of cross-view localization datasets from four perspectives: the trend toward multimodal datasets, large language model approaches, the handling of image distractors, and model optimization. Result: We offer a comprehensive review of cross-view image geo-localization datasets from various perspectives. To the best of our knowledge, this paper is the first review of such datasets in the field, and it can serve as a reference for researchers in related fields. Conclusion: Current datasets still suffer from low data quality, reliance on a single data source, and weak generalization ability, so further research is needed.
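
The retrieval-then-localization idea described in the abstract can be illustrated with a minimal sketch. It assumes that each image has already been encoded into a feature vector by some model (not shown), and that similarity is measured by cosine similarity; the function names `cosine_similarity` and `localize`, the toy feature vectors, and the coordinates are all hypothetical and are not taken from the reviewed paper or any of its datasets.

```python
import math
from typing import List, Sequence, Tuple

GeoTag = Tuple[float, float]  # (latitude, longitude); illustrative type alias


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def localize(query: Sequence[float],
             gallery: List[Sequence[float]],
             tags: List[GeoTag]) -> Tuple[int, GeoTag]:
    """Return the index and GPS tag of the most similar reference image."""
    if not gallery or len(gallery) != len(tags):
        raise ValueError("gallery and tags must be non-empty and aligned")
    scores = [cosine_similarity(query, ref) for ref in gallery]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, tags[best]


# Toy data (assumed): three reference-view features with made-up GPS tags,
# and one query feature from a different viewpoint.
gallery = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
tags = [(39.90, 116.40), (31.23, 121.47), (22.54, 114.06)]
query = [0.5, 0.9, 0.1]

index, tag = localize(query, gallery, tags)
print(index, tag)
```

In practice the feature vectors would come from a learned cross-view model and the gallery would be searched with an approximate nearest-neighbor index rather than a linear scan; the sketch only shows how the GPS tag of the best match is transferred to the query.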

Key Words:

cross-view, image geo-localization, datasets, deep learning, unmanned aerial vehicle, image retrieval, image matching, computer vision
