Discussion on Geological Data Clustering and Big Data Architecture Based on CitusDB
Abstract: Geological data are an important product of geological work; they can be developed and used repeatedly and can provide services over the long term. However, because geological data are managed in a decentralized way, the information is stored in scattered locations, isolated "island" services are widespread, and mechanisms and means for sharing and comprehensively using the information are lacking. This restricts the effective realization of the potential value of geological data. Clustering of geological data information services aims to use cutting-edge information technologies to integrate, cluster, and deeply develop geological data, to bring scattered and isolated data together in a distributed manner, and to interpret, present, and mine the information from multiple angles, so that geological data can fully serve economic and social development. Long-term geological survey work has produced massive volumes of geological data spanning many disciplines and data formats, so the clustering of information services will inevitably face technical problems related to geological big data. This paper introduces the clustering mode of geological data information services, analyzes the distributed big data operating mechanism of CitusDB, and discusses a geological data clustering and big data service architecture based on CitusDB, which may provide a reference for geological big data and information services.