2022, Vol. 26, Issue (6): 1051-1066








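GeoCube: A multi-source Earth observation spatio-temporal cube for large-scale analysis
1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China; 2. Hubei Province Engineering Center for Intelligent Geoprocessing, Wuhan 430079, China; 3. Guangdong South Digital Technology Co., Ltd., Guangzhou 510665, China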


GeoCube: A spatio-temporal cube toward massive and multi-source EO data analysis

The volume of Earth Observation (EO) data has increased tremendously since EO systems were established. Managing such big EO data and turning them into valuable information is a major challenge in the EO domain. This study proposes a multisource EO cube for large-scale analysis. The infrastructure accommodates multisource geospatial data, including raster and vector data. A cube model is designed, and four dimensions (product, space, time, and band) are formalized. Several cube exploration examples are presented. The infrastructure enables large-scale analysis based on cloud computing technology, and a set of distributed cube objects that extend the Spark Resilient Distributed Dataset (RDD) is designed for cube tiles. The distributed cube objects are compatible with multiple data sources, including raster and vector data. Multi-thread computing is combined with cloud computing to form a hybrid parallelism that further improves data access and processing efficiency. Batch computation is also used because a massive number of tiles cannot be loaded into memory at one time. Moreover, a machine learning-based approach is integrated into the cube to enhance parallel geoprocessing. The computational intensity of tiles can be predicted and saved in databases in advance, which eliminates the extra time cost of on-the-fly computational intensity prediction for commonly used products. The design and implementation of the cube infrastructure, named GeoCube, are provided. They cover the ingestion and management of multisource geospatial data in the cube, the processing of geospatial/EO queries against different cube dimensions, and high-performance cube computing of large-scale geospatial datasets. The creation of such a geospatial data cube helps advance the EO data cube approach while keeping connections to the data cube in the business intelligence (BI) domain. The performance of data query and access, data processing, and load balancing is presented.
Results demonstrate the advantages of the GeoCube infrastructure. Several applications are presented, including cube OLAP operations, large-scale time-series analysis, and multisource data cube analysis. In conclusion, compared with existing cube approaches, the proposed infrastructure emphasizes the accommodation of multisource geospatial data (both raster and vector) in the cube, cube tile processing with cloud computing, and machine learning-enabled cube computation. Such a cube can inherit not only the large-scale processing capabilities of EO data cubes but also the data management capabilities of BI data cubes.
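
The abstract describes tiles addressed along four dimensions (product, space, time, band) and processed in batches with multiple threads, because all tiles cannot be held in memory at once. The following is a minimal single-machine sketch of that idea using only the Python standard library. It is not GeoCube's implementation: the names `TileKey`, `batches`, and `process_in_batches`, the field types, and the placeholder worker are all assumptions made for illustration, and the Spark layer of the hybrid parallelism is omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class TileKey:
    """Hypothetical key addressing one cube tile along the four dimensions."""
    product: str  # product dimension (illustrative product name)
    space: int    # space dimension, modelled here as a grid-cell index
    time: str     # time dimension, modelled here as an ISO date string
    band: str     # band dimension


def batches(items: Iterable[TileKey], size: int) -> Iterator[List[TileKey]]:
    """Yield successive lists of at most `size` keys."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_in_batches(
    keys: Iterable[TileKey],
    worker: Callable[[TileKey], int],
    batch_size: int = 4,
    threads: int = 2,
) -> Dict[TileKey, int]:
    """Process tiles one batch at a time; threads work within each batch.

    Only `batch_size` tiles are in flight at once, which bounds memory use.
    """
    results: Dict[TileKey, int] = {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for chunk in batches(keys, batch_size):
            # pool.map preserves input order, so zip pairs keys with results.
            for key, value in zip(chunk, pool.map(worker, chunk)):
                results[key] = value
    return results


if __name__ == "__main__":
    # Ten illustrative tiles; the product and band names are made up.
    demo_keys = [TileKey("demo_product", cell, "2020-01-01", "NIR") for cell in range(10)]
    # Placeholder worker standing in for real tile loading and processing.
    out = process_in_batches(demo_keys, lambda k: k.space * 2)
    print(len(out), out[demo_keys[3]])
```

In a distributed setting, each executor could run a loop like this over its own partition of tiles, which is one way to read the hybrid parallelism described above.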


