Elasticsearch is used to build highly available and scalable systems. You can scale by buying more powerful servers (vertical scaling, or scaling up) or by buying more servers (horizontal scaling, or scaling out).

Elasticsearch gets better performance from more powerful hardware, but vertical scaling has its limitations. True scaling should be horizontal, spreading the load and increasing reliability by adding nodes.

For most databases, scaling out means your application has to change significantly to take advantage of the new machines. Elasticsearch, by contrast, is distributed by nature: it knows how to manage multiple nodes to provide scalability and high availability, which means your application doesn’t need to care. In this chapter we’ll explore how to create clusters, nodes, and shards that scale to your needs and keep your data safe in the event of hardware failure.

An empty cluster

If we start a single node without data or indexes, the cluster looks like Figure 1.

Figure 1: Cluster with one empty node

A node is a running instance of Elasticsearch, and a cluster consists of one or more nodes with the same cluster.name that work together to share data and load. When a node joins or leaves the cluster, the cluster notices and rebalances the data across the available nodes.
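
As a minimal sketch (the name my_cluster and the node name below are just illustrative; the default cluster.name is elasticsearch), cluster membership is controlled in config/elasticsearch.yml: every node that should join the same cluster uses the same cluster.name, while node.name identifies the individual node:

# config/elasticsearch.yml
cluster.name: my_cluster   # must be identical on every node that should join this cluster
node.name: node-1          # a human-readable label for this particular node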

One node in the cluster is elected as the master node, which manages cluster-level changes such as creating or deleting indexes and adding or removing nodes. The master node does not participate in document-level changes or searches, so it does not become a bottleneck as traffic grows. Any node can become the master node. The cluster in our example has only one node, so that node acts as the master.

As users, we can talk to any node in the cluster, including the master node. Every node knows which node each document resides on and can forward requests to the appropriate node. The node we contact is responsible for gathering the data returned by the other nodes and sending the combined response back to the client. All of this is handled transparently by Elasticsearch.
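
As a small sketch — assuming a cluster of two nodes running on one machine, listening on the default HTTP ports 9200 and 9201 — the same request can be sent to either node and yields the same answer, because the receiving node forwards it where necessary and gathers the results:

curl localhost:9200/_count
curl localhost:9201/_count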

Cluster health

There are many statistics you can monitor in an Elasticsearch cluster, but the single most important one is cluster health. Cluster health is reported as one of three states: green, yellow, or red.

GET /_cluster/health

Running the above query in an empty cluster with no indexes returns this information:

{
   "cluster_name":          "elasticsearch",
   "status":                "green", <1>
   "timed_out":             false,
   "number_of_nodes":       1,
   "number_of_data_nodes":  1,
   "active_primary_shards": 0,
   "active_shards":         0,
   "relocating_shards":     0,
   "initializing_shards":   0,
   "unassigned_shards":     0
}

The status field provides an at-a-glance indicator of the cluster’s overall health. The three colors mean:

green — all primary and replica shards are active
yellow — all primary shards are active, but not all replica shards are
red — some primary shards are not active

In the following sections, we will explain what primary shards and replica shards are and what these colors (states) mean in practice.
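
The health endpoint also accepts a wait_for_status parameter, which is handy in scripts: the request blocks until the cluster reaches at least the requested state or the timeout expires. For example:

GET /_cluster/health?wait_for_status=yellow&timeout=30s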

Adding indexes

To add data to Elasticsearch, we need an index — a place to store associated data. In fact, an index is just a “logical namespace” that points to one or more shards.

A shard is the smallest “worker unit”; it holds just a slice of all the data in the index. We’ll explain how sharding works in more detail in the later chapter on in-depth sharding, but for now we just need to know that a shard is a single instance of Lucene and a complete search engine in its own right. Our documents are stored and indexed in shards, but our applications never talk to shards directly; they talk to the index instead.

Shards are how Elasticsearch distributes data across the cluster. Think of shards as containers for data: documents are stored in shards, and shards are allocated to the nodes in your cluster. As your cluster grows or shrinks, Elasticsearch automatically migrates shards between nodes to keep the cluster balanced.
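
To see where shards actually live, the _cat/shards API lists every shard with its index, whether it is a primary (p) or a replica (r), its state, and the node it is allocated to — useful for watching Elasticsearch rebalance as nodes join or leave:

GET /_cat/shards?v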

A shard is either a primary shard or a replica shard. Each document in your index belongs to exactly one primary shard, so the number of primary shards determines the maximum amount of data the index can hold.

In theory there is no limit to how much data a primary shard can store. The practical maximum size of a shard depends entirely on your use case: the capacity of your hardware, the size and complexity of your documents, how you index and query them, and the response times you expect.

A replica shard is simply a copy of a primary shard. It protects against data loss due to hardware failure and serves read requests, such as searching or retrieving a document.

The number of primary shards is fixed when the index is created, but the number of replica shards can be adjusted at any time.

Let’s create an index called blogs on the only (empty) node in our cluster. By default an index is allocated 5 primary shards, but for demonstration purposes we will allocate just 3 primary shards and 1 replica (one replica for each primary shard):

PUT /blogs
{
   "settings" : {
      "number_of_shards" : 3,
      "number_of_replicas" : 1
   }
}
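
We can check that these settings were applied with the index settings API; the response should report "number_of_shards": "3" and "number_of_replicas": "1" for the blogs index (the values come back as strings):

GET /blogs/_settings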

Single-node cluster with an index:

Our cluster now looks like the figure above: all three primary shards are allocated to Node 1. If we check cluster health now, we see the following:

{
   "cluster_name":          "elasticsearch",
   "status":                "yellow", <1>
   "timed_out":             false,
   "number_of_nodes":       1,
   "number_of_data_nodes":  1,
   "active_primary_shards": 3,
   "active_shards":         3,
   "relocating_shards":     0,
   "initializing_shards":   0,
   "unassigned_shards":     3 <2>
}

<1> The cluster status is now yellow

<2> Our three replica shards have not yet been allocated to a node

A cluster health status of yellow means that all primary shards are up and running — the cluster can already serve requests — but not all replica shards are available. In fact, all three replica shards are currently unassigned: they have not been allocated to a node. There is no point in keeping copies of the same data on the same node; if that node failed, we would lose every copy at once.
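
To see exactly which shards are unassigned, the same health API can be asked for more detail via its level parameter (level=indices for per-index health, level=shards for per-shard health):

GET /_cluster/health?level=shards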

Now our cluster is fully functional, but there is still a risk of data loss due to hardware failures.

Failover

Running on a single node means there is a risk of a single point of failure — no data backup. Fortunately, the only thing we need to do to prevent a single point of failure is to start another node.

To test what happens when a second node is added, start it the same way as the first, from the same directory — multiple Elasticsearch instances can run on a single machine. As long as the second node has the same cluster.name as the first (see the ./config/elasticsearch.yml file), it will automatically discover and join the first node’s cluster. If it doesn’t, check the logs to find out what went wrong; multicast discovery may be disabled, or a firewall may be blocking communication between the nodes.
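
As a rough sketch — assuming Elasticsearch 5.0 or later, where individual settings can be passed on the command line with -E — a second instance on the same machine only needs its own node name and its own data and log paths, while keeping the same cluster.name (here, the default):

# first node (started earlier)
./bin/elasticsearch -Enode.name=node-1

# second node: same cluster.name, separate data and log directories
./bin/elasticsearch -Enode.name=node-2 -Epath.data=./data2 -Epath.logs=./logs2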

If we start a second node, the cluster will look like the figure below — a two-node cluster in which all primary and replica shards have been allocated:

The second node has joined the cluster, and the three replica shards have been allocated — one for each primary shard — which means we can lose either node without losing any data.

When a document is indexed, it is first stored on its primary shard and then copied in parallel to the corresponding replica shard. This ensures that our data can be retrieved from both the primary and the replica.
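
For example, indexing a blog post (the _doc endpoint shown here is the one used by recent Elasticsearch versions; older releases use an explicit type name instead) first writes the document to the primary shard that owns its ID and then copies it to that shard’s replica:

PUT /blogs/_doc/1
{
   "title": "My first blog post",
   "body":  "Just trying this out..."
}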

Cluster health is now green, which means all six shards (three primary shards and three replica shards) are active:

{
   "cluster_name":          "elasticsearch",
   "status":                "green", <1>
   "timed_out":             false,
   "number_of_nodes":       2,
   "number_of_data_nodes":  2,
   "active_primary_shards": 3,
   "active_shards":         6,
   "relocating_shards":     0,
   "initializing_shards":   0,
   "unassigned_shards":     0
}

Our cluster is not only fully functional, but also highly available.

Horizontal scaling

As demand on our application grows, how do we scale? If we start a third node, the cluster reorganizes itself, as shown in Figure 4:

Node 3 now holds one shard from Node 1 and one from Node 2, so each node carries two shards instead of three. This means every shard has a larger share of its node’s hardware resources (CPU, RAM, I/O).

Each shard is a complete search engine in its own right, capable of using all the resources of a single node. With our six shards (three primaries and three replicas) we could expand to as many as six nodes, with one shard per node, each shard using 100% of its node’s resources.
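
A quick way to check how evenly the shards are spread across the nodes is the _cat/allocation API, which reports the number of shards and the disk usage per node:

GET /_cat/allocation?v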

Scaling further

What if we want to scale to more than six nodes?

The number of primary shards is fixed when the index is created, and in effect that number defines the maximum amount of data the index can store (the actual limit depends on your data, your hardware, and your use case). However, read requests — searches and document retrieval — can be served by either a primary or a replica shard, so the more copies of the data we have, the more search throughput we can handle.

The number of replica shards can be changed dynamically on a running cluster, which lets us scale up or down as demand requires. Let’s increase the number of replica shards from 1 to 2:

PUT /blogs/_settings
{
   "number_of_replicas" : 2
}

Figure 5: Increasing number_of_replicas to 2:

As the figure shows, the blogs index now has nine shards: three primaries and six replicas. This means we could scale out to as many as nine nodes, again with one shard per node, which would give us up to three times the search performance of the original three-node cluster.

Of course, adding more replica shards to the same number of nodes does not improve performance by itself, because each shard then gets a smaller share of each node’s hardware resources (translator’s note: the extra shards are crowded onto the same few nodes, increasing each node’s load and hurting performance); to increase throughput you need to add hardware. These extra replicas do, however, give us more redundancy: with the configuration above, the cluster can now survive the loss of two nodes without losing any data.

Coping with failure

We’ve already said that Elasticsearch can cope with node failures, so let’s try it out. If we kill the process on the first node, our cluster looks like this:

Figure 6: Cluster after killing the first node

The node we killed was the master node. A cluster must have a master node in order to function properly, so the first thing that happened was that the remaining nodes elected a new master: Node 2.

Primary shards 1 and 2 were lost when we killed Node 1, and an index cannot function properly while it is missing primary shards. If we checked cluster health at that moment, we would see status red: not all primary shards are active!

Fortunately, complete copies of the two lost primary shards exist on the other nodes, so the first thing the new master node does is promote the replica shards on Node 2 and Node 3 to primaries, after which cluster health returns to yellow. The promotion is instantaneous, like flipping a switch.

Why is cluster health yellow and not green? We have three primary shards, but we specified two replica shards for each primary, and at the moment only one replica per primary is allocated. That is why the cluster cannot reach green, but there is no need to worry: even if we also killed Node 2, the application could still run without losing data, because Node 3 holds a copy of every shard.

If we restart Node 1, the cluster will be able to allocate the missing replica shards again, returning to a state similar to Figure 5 in the previous section (Increasing number_of_replicas to 2). If Node 1 still has copies of the old shards, it will try to reuse them, copying from the primary shards only the data that changed while it was down.
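
This recovery process can be observed with the _cat/recovery API, which lists completed and in-flight shard recoveries along with their source and target nodes:

GET /_cat/recovery?v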

You should now have a clear idea of how sharding allows Elasticsearch to scale horizontally and ensure data security. We’ll discuss the sharding life cycle in more detail next.

Reference: official Elasticsearch documentation