A Geographic Information System (GIS), also known as a “geospatial information system” or “resource and environment information system”, is built on geospatial data and uses geographic model analysis to provide a variety of spatial and dynamic geographic information in a timely manner. It is a technical system that, with the support of computer hardware and software, collects, stores, manages, computes, analyzes, displays and describes geographically distributed data for all or part of the Earth’s surface space (including the atmosphere).

Simply put, without GIS you would not even be able to find people near you or search for a good restaurant nearby.

Under the national development outline, two-dimensional geographic information is being upgraded step by step to 3D. The goal is to provide high-level support for the development and decision-making of industries that matter to the national economy and people’s livelihood, such as land and resources, forestry, water conservancy, transportation, electricity and disaster relief. For example, 3D real-scene modeling can clearly simulate which areas come under physical threat for each centimeter the river level rises, so that flood prevention strategies can be implemented precisely. This will push the entire geo-information industry towards greater professionalism and higher standards.

Large-scale geographic information acquisition began with surveyors measuring the land on foot, and later moved to satellite and aerial remote sensing. In recent years, UAVs and high-precision, high-resolution ground-based imaging equipment have developed rapidly, field data collection has become more diverse, and the volume of data has grown explosively. A single collection of geographic information for a medium-sized city can reach 500-600 TB, the data volume for a first-tier city can reach 1-2 PB, and a single high-precision image can reach 400 MB.
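To put these volumes in terms of file counts, the following minimal sketch computes a lower bound on the number of files in a dataset, assuming every file is as large as the 400 MB maximum quoted above. Real collections contain many smaller files, so actual counts would be higher. The function name `min_file_count`, the decimal unit constants, and the choice of 500 TB and 2 PB as example sizes are illustrative assumptions, not figures taken from any tool.

```python
# Decimal (SI) units, as storage capacities are usually quoted.
MB = 10**6
TB = 10**12
PB = 10**15


def min_file_count(dataset_bytes: int, max_file_bytes: int) -> int:
    """Smallest number of files that can hold the dataset (ceiling division)."""
    return -(-dataset_bytes // max_file_bytes)


if __name__ == "__main__":
    # Example sizes picked from the ranges quoted in the text (assumption).
    examples = [
        ("medium-sized city, 500 TB", 500 * TB),
        ("first-tier city, 2 PB", 2 * PB),
    ]
    for label, size in examples:
        count = min_file_count(size, 400 * MB)
        print(f"{label}: at least {count:,} files of up to 400 MB")
```

Even under this most favorable assumption, a single city-scale collection holds millions of files, which is why metadata and small-file handling matter as much as raw capacity.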

After the data is collected, professional software (such as Smart3D, Bentley’s ContextCapture, etc.) is used in production to process hundreds of terabytes of raw data (images, videos, point clouds, etc.), performing operations such as aerial triangulation and annotation. These specialized programs run on a distributed cluster made up of a large number of HPC nodes, all of which must read raw data from a shared data source. Does this architecture mean that as long as the data is shared across machines and there are enough computing resources, geographic information mapping can be completed quickly? The answer, of course, is no.

So what kind of storage should be chosen to support large-scale geographic information rendering? The ArcGIS installation guide gives the following recommendations and requirements for storage: https://enterprise.arcgis.com…

Real-time consistency and performance are the two basic requirements to consider when choosing a file storage device (NAS/SAN). Before choosing one, it is important to understand the server directories and the configuration store.

  • Real-time consistency: Select a storage device that allows any node in the compute cluster to read a file as soon as the operation and its corresponding writes are complete. When using NFS, you must configure it so that every node reads consistent data and does not rely on out-of-date cached data.
  • Performance: Choose a storage device that performs well under random I/O and small files, to minimize their impact. Different storage devices have different I/O characteristics, and their read and write performance varies widely. Small I/O performance is important because ArcGIS Enterprise I/O operations (such as interactions with the configuration store, cache tiles and bundles, etc.) consist of a large number of small random I/Os. This generally means that devices optimized for large sequential reads and writes, which are typical of image and video workloads, are not suitable for use with ArcGIS Enterprise components. If the selected file store does not handle small random I/O well, users may experience significantly increased response times or even failed rendering of projects.

It follows that the performance of the storage cluster, and especially its small random I/O performance, is a hard requirement of GIS software.
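As a rough illustration of the difference between small random reads and large sequential reads, the sketch below times both patterns against a single test file. It is only a minimal probe under stated assumptions: the helper names `timed_reads` and `make_test_file`, the 4 KiB and 1 MiB block sizes, and the 8 MiB file size are all illustrative choices. On a local disk the operating system's page cache will hide most of the difference, so a meaningful run would point `target_dir` at the shared mount under test and use a file larger than the client cache.

```python
import os
import random
import tempfile
import time


def timed_reads(path, block_size, count, randomize, seed=0):
    """Read `count` aligned blocks from `path`; return (bytes_read, seconds)."""
    blocks_in_file = os.path.getsize(path) // block_size
    if blocks_in_file == 0:
        raise ValueError("file is smaller than one block")
    rng = random.Random(seed)
    if randomize:
        indices = [rng.randrange(blocks_in_file) for _ in range(count)]
    else:
        indices = [i % blocks_in_file for i in range(count)]
    bytes_read = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level read-ahead; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        for index in indices:
            f.seek(index * block_size)
            bytes_read += len(f.read(block_size))
    return bytes_read, time.perf_counter() - start


def make_test_file(directory, size_bytes):
    """Create a file of random bytes in `directory` and return its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    # Hypothetical target: replace with a directory on the shared mount under test.
    target_dir = tempfile.gettempdir()
    path = make_test_file(target_dir, 8 * 1024 * 1024)
    try:
        patterns = [
            ("random 4 KiB", 4096, 2048, True),
            ("sequential 1 MiB", 1024 * 1024, 8, False),
        ]
        for name, block, count, rnd in patterns:
            n, secs = timed_reads(path, block, count, rnd)
            secs = max(secs, 1e-9)  # guard against a zero-length interval
            print(f"{name}: {n / secs / 1e6:.1f} MB/s, {count / secs:.0f} reads/s")
    finally:
        os.remove(path)
```

Both patterns read the same total number of bytes, so any gap in the reported throughput comes from the access pattern rather than the data volume.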

In working with users’ actual production environments, we have found that shared data storage faces two major challenges:

  • The small I/O read and write performance of shared storage limits the progress of mapping in projects such as real-scene modeling.
  • Common shared data stores (such as NAS arrays or ordinary distributed file stores) cannot support concurrent processing by large-scale Smart3D or ContextCapture compute clusters.

In one actual project, the user ran Smart3D real-scene modeling on an existing distributed storage system. A single storage deployment could support only 60 compute nodes, which fell short of the user’s expectations for both storage efficiency and compute utilization. YanRong’s technical engineers built a YRCloudFile storage cluster on the same servers. With the YRCloudFile Windows client, the cluster scaled to support up to 600 compute nodes, and mapping work that originally took 3 days was shortened to 2 hours.
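The figures above can be turned into ratios with a few lines of arithmetic. This is a back-of-the-envelope sketch, and it assumes that the 3-day run corresponds to the 60-node setup and the 2-hour run to the 600-node setup, which the text implies but does not state outright. The function names are illustrative.

```python
def speedup(before_hours: float, after_hours: float) -> float:
    """How many times faster the job finished."""
    return before_hours / after_hours


def scale_factor(before_nodes: int, after_nodes: int) -> float:
    """How many times larger the compute cluster became."""
    return after_nodes / before_nodes


if __name__ == "__main__":
    time_gain = speedup(3 * 24, 2)      # 3 days expressed in hours vs. 2 hours
    node_gain = scale_factor(60, 600)   # supported compute nodes
    print(f"wall-clock speedup: {time_gain:.0f}x")
    print(f"cluster size increase: {node_gain:.0f}x")
    # Assumption: work divides evenly across nodes, so any speedup beyond
    # the node increase is attributed to higher per-node throughput.
    print(f"implied per-node gain: {time_gain / node_gain:.1f}x")
```

Under that assumption, the job finished 36 times faster on 10 times as many nodes, which would suggest each node was also kept busier than before.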

GIS 3D real-scene modeling is another typical scenario in which YRCloudFile’s performance and large-scale concurrency come into full play. Backed by YRCloudFile’s high-performance storage, the efficiency of geographic information projects can be greatly improved.