Table of contents

1. Set up Kibana
2. Cluster health
3. Perform index operations

1. Set up Kibana

As the Kibana user guide explains, Kibana is an open source data analysis and visualization platform designed to work with Elasticsearch (ES), so we can use it to interact with ES.

Download and extract:

cd /usr/local
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.6.1-linux-x86_64.tar.gz
tar -zxvf kibana-6.6.1-linux-x86_64.tar.gz -C .


Start Kibana:

cd kibana-6.6.1-linux-x86_64/

bin/kibana


By default, Kibana connects to ES on localhost. This setting lives in config/kibana.yml, which you can modify if necessary. Once Kibana has started successfully, we can write ES commands in the Console under the Dev Tools menu. Other features of Kibana will be described later.

2. Cluster health

Before performing index operations, it is important to understand the health of the ES cluster, because it helps us interpret the results those operations return.

To view the cluster health status, run the GET _cluster/health command. Result:

{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 1,
  "active_shards" : 1,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}


From the result, we can see that the current cluster health is green and active_shards_percent_as_number is 100.0, meaning that every shard in the cluster is active.

The cluster health status is one of green, yellow, or red.

  • Green: the primary shards and replica shards of every index are assigned and active. The cluster is 100% available.
  • Yellow: the primary shards of every index are active, but at least one replica shard is not. No data is lost and the cluster is still available.
  • Red: at least one primary shard is not active, and all of its replica shards are missing. Some data is unavailable (and may be lost), so searches return only partial results.
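The three rules above can be sketched as a small function. This is only an illustration of the definitions, not how ES computes the status internally; the dict keys `primary` and `active` are hypothetical names chosen for this example.

```python
def cluster_status(shard_copies):
    """Classify cluster health from a list of shard copies.

    Each copy is a dict with two hypothetical keys:
    'primary' (True for a primary shard) and 'active' (True if started).
    """
    # Any inactive primary shard makes the cluster red.
    if any(c["primary"] and not c["active"] for c in shard_copies):
        return "red"
    # All primaries are active, but some replica is not: yellow.
    if any(not c["primary"] and not c["active"] for c in shard_copies):
        return "yellow"
    return "green"


copies = [
    {"primary": True, "active": True},
    {"primary": False, "active": False},
]
print(cluster_status(copies))  # yellow
```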

When the cluster status is not green, we can use the GET _cluster/health?level=indices command to look at the details of each index, determine which index has a problem, and resolve it.
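As a rough sketch of how such a response could be inspected in a script, the function below picks out every index whose status is not green. It assumes the response has already been parsed into a dict with an `indices` object keyed by index name; the sample data is made up for illustration and is not real cluster output.

```python
def unhealthy_indices(health):
    """Map the name of every non-green index to its status."""
    return {
        name: info["status"]
        for name, info in health.get("indices", {}).items()
        if info["status"] != "green"
    }


# Hypothetical, hand-written sample (not taken from a real cluster).
sample = {
    "status": "yellow",
    "indices": {
        "orders": {"status": "yellow", "unassigned_shards": 6},
        "logs": {"status": "green", "unassigned_shards": 0},
    },
}
print(unhealthy_indices(sample))  # {'orders': 'yellow'}
```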

A value of 0 for relocating_shards in the output indicates that no shards are currently being moved from one node to another. Why do shards relocate? ES is a distributed search and analytics engine, and distributed systems usually hold massive amounts of data, so ES balances shards across nodes. When new nodes join or existing nodes go offline, ES automatically redistributes shards so that every node holds roughly the same number of shards and the nodes can share the request load evenly. Whether a shard may move is determined by the deciders in ES, which have the final say on allocation decisions: when several deciders are involved, a single decider voting against reassigning a shard is enough to prevent the move. In general, the value of relocating_shards is 0. So when is initializing_shards not zero? When a node has just restarted, or when a shard has just been created.
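The veto rule can be illustrated with a toy model. The two deciders below and the dict fields they read are hypothetical and much simpler than the real deciders in ES; the point is only that one "no" vote blocks the move.

```python
def same_shard_decider(shard, node):
    """Vote yes only if the node holds no other copy of this shard."""
    return shard["id"] not in node["shard_ids"]


def disk_decider(shard, node):
    """Vote yes only if the node has room for the shard."""
    return node["free_bytes"] >= shard["size_bytes"]


def can_move(shard, node, deciders):
    """A move is allowed only if every decider votes yes."""
    return all(decider(shard, node) for decider in deciders)


deciders = [same_shard_decider, disk_decider]
shard = {"id": "orders-0", "size_bytes": 500}
roomy_node = {"shard_ids": {"orders-1"}, "free_bytes": 1000}
full_node = {"shard_ids": set(), "free_bytes": 100}

print(can_move(shard, roomy_node, deciders))  # True
print(can_move(shard, full_node, deciders))   # False
```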

What does unassigned_shards mean? These are shards that exist in the cluster state but cannot be found on any node in the actual cluster. For example, suppose there are two nodes and an index with two primary shards, each with two replica shards. Node 1 holds primary shard P0 and replica shard R1; node 2 holds primary shard P1 and replica shard R0. In this case two replica shards remain unassigned, because a replica shard cannot be placed on the same node as its own primary shard, nor on the same node as another replica of the same primary shard (this is determined by the even_shard shard allocator). unassigned_shards is therefore 2. Generally, the unassigned shards are replica shards.
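The arithmetic behind this example can be written down directly. The sketch assumes the only constraint is that two copies of the same shard never share a node, and that all nodes are otherwise able to accept shards; the function name is made up for this illustration.

```python
def unassigned_replicas(nodes, primaries, replicas_per_primary):
    """Count replica shards that cannot be placed.

    Each primary shard has 1 + replicas_per_primary copies, and each
    copy needs a node of its own, so any copies beyond the number of
    nodes stay unassigned.
    """
    copies_per_shard = 1 + replicas_per_primary
    return primaries * max(0, copies_per_shard - nodes)


# Two nodes, two primary shards, two replicas each (the example above).
print(unassigned_replicas(nodes=2, primaries=2, replicas_per_primary=2))  # 2
```

With a third node every copy has a place, and the result drops to 0.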

In addition, we can use the ES cat API to see the health of the cluster. The cat API does not return JSON; it returns a plain-text table without a header row. To make the output easier to read, we can add the v parameter so that the header is printed as well: GET _cat/health?v

3. Perform index operations

  • List indices:

    GET _cat/indices?v

The output shows, for each index, its health, status, name, UUID, number of primary shards, number of replica shards, number of documents, number of documents marked as deleted, total index size, and primary shard size.
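If the table needs to be processed in a script, a minimal sketch like the following can turn it into a list of dicts. It assumes the text was obtained with the v parameter (so the first line is the header) and that no cell is empty or contains spaces. The sample row is invented for illustration, including its UUID and sizes.

```python
def parse_cat_table(text):
    """Parse whitespace-separated cat API output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Hypothetical output; the values are made up.
sample = """\
health status index  uuid   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   orders abc123 3   2   0          0            690b       690b
"""

for row in parse_cat_table(sample):
    print(row["index"], row["health"], row["pri"], row["rep"])
```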

  • Create index:

    PUT /indexName

    When creating an index, you can specify the number of primary shards and the number of replica shards configured for each primary shard. If these settings are not specified, an index in ES 6.x has 10 shards by default: 5 primary shards and 5 replica shards (one replica per primary), even in a single-node environment. This is a typical case of over-allocation. If you know what data set the index will hold, it is a good idea to specify the required number of primary shards and replica shards when creating the index. Otherwise, decide based on the number of nodes.

For example, create an index named orders with three primary shards, each configured with two replica shards.

PUT /orders
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}

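Outside the Kibana Console, the same request can be sent over HTTP. The sketch below only builds the request with Python's standard library; it assumes ES is listening at http://localhost:9200, and the helper name `create_index_request` is made up for this example.

```python
import json
import urllib.request


def create_index_request(base_url, index, shards, replicas):
    """Build (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed address of a local ES node.
req = create_index_request("http://localhost:9200", "orders", 3, 2)
print(req.get_method(), req.full_url)  # PUT http://localhost:9200/orders

# Sending the request needs a running ES node:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```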

Once the index has been created, the number of primary shards cannot be changed, while the number of replica shards can be changed at any time. The reason is that ES uses the following routing formula to work out which shard a document is stored in, and the same formula must yield the same shard when the document is later retrieved:

shard = hash(routing) % number_of_primary_shards

By default, the routing value is the id of the document. You can also specify a custom routing value through the routing parameter.
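A short sketch shows why the shard count must stay fixed. It uses `zlib.crc32` purely as a stand-in hash (ES uses its own hash function internally), so the shard numbers it produces will not match those of a real cluster; only the modulo behaviour is the point.

```python
import zlib


def shard_for(routing, number_of_primary_shards):
    """Pick a shard as hash(routing) % number_of_primary_shards.

    zlib.crc32 is a stand-in for the hash function ES really uses.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = [str(i) for i in range(100)]

# The same id always maps to the same shard while the count is fixed.
assert all(shard_for(d, 3) == shard_for(d, 3) for d in doc_ids)

# Changing the count from 3 to 4 sends many ids to a different shard,
# so documents indexed earlier would no longer be found.
moved = [d for d in doc_ids if shard_for(d, 3) != shard_for(d, 4)]
print(f"{len(moved)} of {len(doc_ids)} ids would map to a different shard")
```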

Each document is stored in exactly one primary shard and in the replica shards of that primary shard. Different primary shards hold different documents.

When a large amount of data needs to be added, we usually scale horizontally, that is, buy more servers with the same configuration, rather than scale vertically, that is, add servers with larger capacity and higher specifications. There are three reasons for scaling horizontally. First, cost. Second, more servers allow shards to be spread across more nodes, which improves the fault tolerance of the system; each node also holds fewer shards, so more CPU and other resources can be allocated to each shard. Third, it is easier to expand further in the future.

  • Delete index:

    DELETE /indexName
