A graph database is a database that uses graph structures, with nodes, edges, and attributes, to represent and store data for semantic queries. Graph databases have a wide range of applications: they can compute the connections between entities, powering friend recommendation in social networks, risk control in finance, real-time product recommendation in retail, and more.
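To make the friend-recommendation use case concrete, here is a minimal sketch in plain Python: it models a social graph as adjacency sets and ranks non-friends by the number of mutual friends. The names and scoring rule are illustrative only; a graph database would express this as a traversal query instead.

```python
from collections import defaultdict

# Toy social graph as adjacency sets; the names are made up for illustration.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}

def recommend(user):
    """Rank non-friends by the number of mutual friends (friends-of-friends)."""
    scores = defaultdict(int)
    for friend in friends[user]:
        for fof in friends[friend]:
            if fof != user and fof not in friends[user]:
                scores[fof] += 1
    return sorted(scores, key=lambda u: (-scores[u], u))

print(recommend("alice"))  # dave is reachable through two mutual friends
```

In a graph database the same two-hop traversal stays fast as the graph grows, which is exactly the workload relational joins struggle with.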

Nebula Graph Overview and Architecture

Nebula is a high-performance, linearly scalable, open-source distributed graph database. Its separated storage and compute architecture allows the compute and storage layers to scale out and in independently, which means Nebula can take advantage of cloud native technologies for elastic scalability and cost control. It can hold graphs with hundreds of billions of vertices and trillions of edges while providing millisecond-level query latency.

Nebula Graph

A Nebula cluster contains three core services: Graph Service, Meta Service, and Storage Service. Each service consists of several replicas that are evenly distributed across the deployment nodes according to the scheduling policy.

The Graph Service process is nebula-graphd. It consists of stateless, independent compute nodes that do not communicate with each other. Its main job is to parse the nGQL text sent by the client, generate an execution plan through the Lexer and Parser, and hand the plan to the execution engine after optimization. The execution engine obtains the schemas of vertices and edges from the Meta Service and reads the vertex and edge data through the storage engine layer.

The Meta Service process is nebula-metad, which forms a distributed cluster based on the Raft protocol. All Meta Service nodes in the cluster elect a leader to provide services, while the followers stand by and replicate updated data from the leader. If the leader goes down, a follower is elected as the new leader. The Meta Service not only stores and serves the metadata of graph data, such as the field types of Space, Schema, Partition, Tag, and Edge attributes, but is also responsible for operational tasks such as directing data migration and leader changes.

The Storage Service process is nebula-storaged, which uses a shared-nothing distributed architecture. Each Storage node runs multiple local KV storage instances as its core physical storage, and Nebula uses Raft to ensure consistency among these KV stores. The storage engines currently supported are RocksDB and HBase.

Nebula Graph provides C++, Java, Golang, Python, and Rust clients that communicate with the servers over RPC using the facebook-thrift protocol. You can also use Nebula Graph through Nebula Console and Nebula Studio.
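The leader election among metad replicas described above follows Raft's majority-vote rule. The sketch below shows only that rule in isolation; real Raft additionally tracks terms, log matching, and heartbeats, which are omitted here.

```python
def has_quorum(votes_granted, cluster_size):
    """Raft rule: a candidate becomes leader only with a strict majority of votes."""
    return votes_granted > cluster_size // 2

# Three metad replicas: the leader fails, and a follower asks for votes.
# With 2 of 3 votes granted it wins; a lone vote can never win.
print(has_quorum(2, 3))  # True
print(has_quorum(1, 3))  # False
```

This is also why Meta and Storage replica counts are kept odd: an even-sized cluster tolerates no more failures than the next smaller odd one.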

Cloud architecture challenges

Nebula Graph’s cloud product is positioned as a DBaaS (Database-as-a-Service) platform, so there is no question that cloud native technologies are needed to achieve this goal. But how exactly do we get there? The first thing to be clear about is that no technology is a silver bullet; there is only the right technology for the right scenario. Although there are many open source products to choose from when building such a platform, delivering the product to customers still involves many challenges. Here are three of them:

Business challenges

The platform must adapt resources across multiple cloud vendors, which requires a unified resource abstraction model. At the same time, internationalization must account for regional cultural differences, local laws and regulations, and differences in customers' spending habits. These factors shape the interaction design needed to cater to local user habits and improve the user experience.

Performance challenges

In most cases, data that travels within a single cloud vendor's network moves much faster than data that must cross the public Internet from one cloud vendor to another, so cross-cloud network connectivity can be a serious performance bottleneck in a multi-cloud architecture. Data silos are hard to break down because companies cannot easily migrate data stored in different formats on different technology stacks, and this lack of portability poses a potential risk to a multi-cloud strategy. Within a single cloud vendor it is easy to configure automatic scaling of workloads using the vendor's native auto-scaling tools, but this becomes tricky when a user's workloads span multiple cloud vendors.

Operational challenges

Operating large-scale Kubernetes clusters is very challenging, and keeping pace with rapidly evolving business and user needs is a great test for the team. The first step is to standardize and visualize cluster management, and then to turn all operation and maintenance actions into repeatable processes. This requires a management platform with a deep understanding of operational pain points, one that can cover most of our operation and maintenance needs.

Data security is another concern: migrating data from one platform to another (or from one region to another) without proper governance and security controls creates data security risks.

DBaaS (Database-as-a-Service)

Put simply, cloud native technology provides users with a simple, agile, flexible, scalable, and replicable way to make the most of resources on the cloud, and its continuous evolution lets users focus more on business development. Looking at the pyramid from IaaS up to the cloud native application layer at the top, the product forms become more flexible, the unit of deployment becomes finer-grained and more modular, and the degree of operational automation, elastic efficiency, and fault recovery grows ever higher. Each layer up decouples the application more thoroughly from the underlying physical infrastructure, so instead of attending to the entire chain from hardware servers to business implementation, users need only focus on the business itself.

Since the de facto container orchestration system for PaaS platforms is Kubernetes, it is natural to build this platform on Kubernetes, and using the capabilities it provides lets us achieve twice the result with half the effort. Kubernetes defines a container runtime interface, so any runtime that implements this interface can serve as the basic environment in which applications run. Kubernetes also provides extension points all the way from the kubectl command-line terminal down to the storage, network, and compute that containers depend on, so users can implement custom plug-ins for their business scenarios and plug them into the Kubernetes platform without worrying about intrusive changes.

User view

NebulaCloud currently offers users two ways to access the service. One is through a browser into the Studio window, where users can explore graphs and execute nGQL statements after their data has been imported. The other is to connect to NebulaCloud through the cloud vendor's private-link service, which connects users directly to Nebula instances via Nebula Console or a Nebula client.

NebulaCloud architecture

In terms of service architecture, NebulaCloud can be divided into three layers. The bottom is the resource adaptation layer, which provides an abstract description of cloud vendors, regional clusters, and homogeneous or heterogeneous resource pools. Above it sit the business layer and the resource layer: the business layer covers basic services, instance management, tenant management, billing management, data import management, and other business modules, while the resource layer is responsible for providing the runtime environment for Nebula clusters with optimal resource configuration under the scheduling policies. The top layer is the gateway layer, which provides external access services.


NebulaCloud’s internal process

The following is a schematic description of creating a Nebula cluster, using AWS as an example. After the request is submitted, the Nebula Platform service schedules resource pools, load balancing, and security policies, creates the instance through the Nebula Operator API, configures the ALB rules, and finally provides the user with access to the instance.
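The flow can be sketched as an ordered sequence of provisioning steps. Everything here is a hedged illustration: the function name, spec fields, and step wording are hypothetical, not the actual NebulaCloud API.

```python
# Illustrative sketch of the provisioning flow; names and fields are
# hypothetical assumptions, not the real NebulaCloud internals.

def create_nebula_cluster(spec):
    """Return the ordered provisioning steps for a cluster request."""
    steps = []
    steps.append(f"schedule a resource pool in {spec['region']}")
    steps.append(f"submit a NebulaCluster CR with {spec['replicas']} storaged replicas")
    steps.append("configure ALB listener rules for the new instance")
    steps.append("return the private-link endpoint to the user")
    return steps

plan = create_nebula_cluster({"region": "us-east-1", "replicas": 3})
for step in plan:
    print("-", step)
```

The point of the sketch is the ordering: networking (ALB rules) is only wired up after the Operator has accepted the cluster spec, so users never see a half-provisioned endpoint.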

Nebula-Operator

In Kubernetes there are two ways to define a new object: the CustomResourceDefinition (CRD) and the aggregated API server. CRD is the current mainstream practice, and Nebula-Operator is implemented with CRDs. CRD plus a custom controller is the typical Operator pattern: by registering CRDs with the Kubernetes system, the controller can observe the state of the Nebula cluster and its associated resource objects, and then follow the written reconciliation logic to drive the Nebula cluster toward the desired state. Nebula-Operator encapsulates the domain knowledge of managing Nebula, reducing the complexity of operating NebulaGraph and making core operations such as elastic scale-out, scale-in, and rolling upgrades easy to perform. On top of the Kubernetes RESTful API we built an API for managing Nebula clusters, so that users can plug it into their own PaaS platforms and build their own graph computing platforms. Nebula-Operator is still a work in progress; the low-level support from Nebula for rolling upgrades of instances is expected to be available later this year.
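The reconciliation logic at the heart of the Operator pattern can be shown in miniature: compare the desired state declared in the custom resource with the observed state, and emit the actions that converge them. This is a toy illustration only; the component names mirror the Nebula processes, but the dict "schema" is not the real NebulaCluster CRD.

```python
# Toy reconcile loop in the spirit of the Operator pattern: diff desired
# vs. observed state and compute converging actions. Illustrative only.

def reconcile(desired, observed):
    """Return the scaling actions needed to drive observed toward desired."""
    actions = []
    for component, want in desired.items():
        have = observed.get(component, 0)
        if have < want:
            actions.append(f"scale up {component}: {have} -> {want}")
        elif have > want:
            actions.append(f"scale down {component}: {have} -> {want}")
    return actions

desired = {"graphd": 3, "metad": 3, "storaged": 5}
observed = {"graphd": 3, "metad": 3, "storaged": 3}
print(reconcile(desired, observed))  # ['scale up storaged: 3 -> 5']
```

A real controller runs this loop continuously on watch events, so the cluster self-heals: deleting a storaged pod simply makes `observed` drop below `desired` on the next pass.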

KubeSphere multi-cluster management

Platform management

KubeSphere grew out of the operations panel of the QingCloud public cloud; besides inheriting its polished appearance, it is also quite complete in functionality. NebulaCloud runs its service on Kubernetes clusters hosted across the mainstream cloud providers, so multi-cluster management is the feature that matters most to us. We deployed the Host cluster in our local environment and attached the Kubernetes clusters hosted on the other clouds directly as Member clusters. Here you need to pay attention to the ApiServer access configuration and open access from a single IP, such as the egress public IP of the local environment.

Process-based operation

We used the IaC tool Pulumi to deploy new clusters, and automation scripts to set the member role of the cluster to be managed, with no manual operation. Cluster creation is triggered by the platform's alarm module: when a cluster's resource quota reaches the alarm watermark, a new cluster is created automatically.
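The watermark rule that gates automatic cluster creation is a simple threshold check, sketched below. The 80% watermark is an illustrative assumption, not a KubeSphere or NebulaCloud default.

```python
# Sketch of the alarm-driven expansion rule: crossing the resource-quota
# watermark triggers provisioning of a new cluster (via Pulumi in our case).
# The 0.8 threshold is an assumed value for illustration.

WATERMARK = 0.8

def needs_new_cluster(used_quota, total_quota):
    """True when usage reaches the alarm watermark."""
    return used_quota / total_quota >= WATERMARK

print(needs_new_cluster(85, 100))  # True: over the watermark, trigger IaC
print(needs_new_cluster(50, 100))  # False: plenty of headroom
```

Keeping the trigger in the alarm module rather than in the provisioning scripts means the same watermark that pages an operator can also kick off fully automated expansion.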

Automatic monitoring

KubeSphere provides a variety of built-in alarm policies and supports custom alarm policies; the built-in ones basically cover the daily monitoring metrics. There are also multiple alert channels to choose from. We use a combination of email and DingTalk: important and urgent alerts go directly to the on-duty staff through DingTalk, while ordinary ones go through email.

Smart operation

KubeSphere provides a multi-dimensional global view of the clusters, and the number of clusters we currently manage is still small. In the future, as the number of member clusters grows, we can refine resource scheduling and predict faults by analyzing operational data, so as to discover risks further in advance and improve the quality of operation.

Other tools

KubeSphere also ships many useful supporting tools, such as log query, event query, and operation audit logs, which are essential for fine-grained operation. We have already connected the test environment cluster, and will try connecting the production clusters once deeper use has given us a full picture of KubeSphere.

Future plans

We will take advantage of the custom alarm policies to build a panoramic view of the Nebula clusters' own monitoring metrics, with a multi-level, multi-dimensional alarm mechanism covering the core indicators to eliminate risks at the source. We will also improve the surrounding supporting tools to reduce the risk of misoperation through active, passive, and process-based safeguards, and enable DevOps workflows that connect the development, test, pre-release, and production environments while reducing human involvement.