Preface


Now that you’ve seen what Elasticsearch is and what an inverted index is, let’s get familiar with some of the terms used in ES.


[liuzhirichard] Notes on technology, development, and source code from work and study, with the occasional post about everyday life. Feedback and corrections are welcome!

Commonly used terms

| Term | Explanation |
| --- | --- |
| cluster | A cluster consists of one or more nodes that share the same cluster name. Each cluster has a single master node, which the cluster elects automatically and replaces if the current master fails. |
| node | A node is a running instance of Elasticsearch that belongs to a cluster. At startup, a node uses unicast to discover an existing cluster with the same cluster name and tries to join it. |
| index | An index is similar to a table in a relational database. It is a logical namespace that maps to one or more primary shards and can have zero or more replica shards. |
| index alias | An index alias is a secondary name used to refer to one or more existing indexes. Most Elasticsearch APIs accept an index alias in place of an index name. |
| mapping | Each index has a mapping, which defines a type plus a number of index-wide settings. A mapping can be defined explicitly, or it is generated automatically when a document is indexed. |
| shard | A shard is a single Lucene instance. It is the smallest unit of work and is managed automatically by Elasticsearch. An index is a logical namespace that points to primary and replica shards. |
| primary shard | Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard and then on all replicas of that primary shard. By default, an index has one primary shard (Elasticsearch 7.0 and later). You can specify more primary shards to scale the number of documents the index can handle. Once an index is created, you cannot change its number of primary shards. However, an index can be split into a new index using the Split API. |
| replica shard | Each primary shard can have zero or more replicas. A replica is a copy of the primary shard. |
| document | A document is a JSON document stored in Elasticsearch. Each document is stored in an index and has a type and an id. The original JSON document is stored in the _source field, which is returned by default when the document is retrieved or searched. |
| id | The ID identifies a document and must be unique within its index. If no ID is specified, one is generated automatically. |
| field | A document contains a list of fields, or key-value pairs. Fields are similar to columns in a table in a relational database. |
| source field | By default, the indexed JSON document is stored in the _source field and is returned by all get and search requests. This gives you direct access to the original object from the search results, without a second step to retrieve the object by its ID. |
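To make the index, mapping, alias, and document terms more concrete, here is a minimal sketch that only builds the request bodies and does not talk to a cluster. It assumes the 7.x-style REST layout (typeless `mappings.properties` and the `_doc` endpoint); older versions nest the mapping under a type name. The index name `articles-v1`, the alias `articles`, the field names, and both helper functions are hypothetical and chosen only for illustration.

```python
import json


def build_create_index_request(index_name, alias, primary_shards=1, replicas=1):
    """Return (method, path, body) for a create-index request."""
    body = {
        # Shard layout: fixed number of primaries, adjustable number of replicas.
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        },
        # Explicit mapping; without it, one is generated when a document is indexed.
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "published_at": {"type": "date"},
            }
        },
        # Secondary name that can be used in place of the index name.
        "aliases": {alias: {}},
    }
    return "PUT", f"/{index_name}", body


def build_index_document_request(index_name, doc_id, source):
    """Return (method, path, body) for indexing one document with an explicit ID."""
    # The body is the JSON document itself; it is what later comes back as _source.
    return "PUT", f"/{index_name}/_doc/{doc_id}", source


create_request = build_create_index_request("articles-v1", "articles", 3, 1)
doc_request = build_index_document_request(
    "articles-v1", "1", {"title": "ES terms", "published_at": "2020-01-01"}
)

for method, path, body in (create_request, doc_request):
    print(method, path)
    print(json.dumps(body, indent=2))
```

The printed bodies can be sent with any HTTP client; keeping them as plain dictionaries makes it easy to see which term from the list above each key corresponds to.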

Put together, the relationship between these terms looks something like this:

What are replica shards for?

  1. Failover: if the primary shard fails, a replica shard can be promoted to primary.

  2. Performance: get and search requests can be handled by either primary or replica shards.

By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index. A replica shard is never started on the same node as its primary shard.
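Because the replica count is a dynamic setting, it can be changed through the update index settings endpoint without recreating the index. The sketch below only builds that request; the index name `articles-v1` and the helper function are hypothetical.

```python
import json


def build_update_replicas_request(index_name, replicas):
    """Return (method, path, body) that changes the replica count of an index."""
    if replicas < 0:
        raise ValueError("replicas must be zero or more")
    # number_of_replicas can be changed at any time; number_of_shards cannot.
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index_name}/_settings", body


replica_request = build_update_replicas_request("articles-v1", 2)
print(replica_request[0], replica_request[1], json.dumps(replica_request[2]))
```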

Other than defining the number of primary and replica shards an index should have, you never need to refer to shards directly. Your code should deal only with indexes.

Elasticsearch distributes shards across all nodes in the cluster and can automatically move shards from one node to another when a node fails or new nodes are added.
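The toy model below illustrates the idea of spreading shard copies over nodes and promoting a replica when a node is lost. It is not Elasticsearch's actual allocation algorithm: the round-robin placement, the node names, and both functions are assumptions made for illustration, and unlike a real cluster the model does not rebuild the missing replicas after a failure.

```python
def allocate(nodes, primaries, replicas):
    """Place shard copies round-robin so copies of one shard sit on distinct nodes.

    Returns (layout, unassigned), where layout maps each node to a list of
    (shard_number, role) pairs and role is "primary" or "replica".
    """
    layout = {node: [] for node in nodes}
    unassigned = []
    for shard in range(primaries):
        for copy in range(replicas + 1):
            role = "primary" if copy == 0 else "replica"
            if copy < len(nodes):
                # Offsetting by the copy number keeps copies on different nodes.
                node = nodes[(shard + copy) % len(nodes)]
                layout[node].append((shard, role))
            else:
                # Not enough nodes to keep this copy apart from the others.
                unassigned.append((shard, role))
    return layout, unassigned


def fail_node(layout, failed):
    """Drop a node and promote a surviving replica for each primary it held."""
    survivors = {n: list(s) for n, s in layout.items() if n != failed}
    for shard, role in layout[failed]:
        if role != "primary":
            continue
        for shards in survivors.values():
            if (shard, "replica") in shards:
                shards[shards.index((shard, "replica"))] = (shard, "primary")
                break
    return survivors


layout, unassigned = allocate(["node-1", "node-2", "node-3"], primaries=3, replicas=1)
after_failure = fail_node(layout, "node-1")
for node, shards in after_failure.items():
    print(node, shards)
```

In this run `node-1` held the primary of shard 0, so after it fails the replica of shard 0 on `node-2` becomes the new primary.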

Before Elasticsearch 7.0, the default was 5 primary shards with 1 replica each. Since 7.0, the default is 1 primary shard with 1 replica.
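As a quick arithmetic check, the total number of shards in an index is the number of primaries multiplied by one plus the replica count. The helper name below is hypothetical.

```python
def total_shards(primaries, replicas_per_primary):
    """Each primary shard exists once, plus one copy per configured replica."""
    return primaries * (1 + replicas_per_primary)


print(total_shards(5, 1))  # 5 primaries + 5 replicas = 10
print(total_shards(1, 1))  # 1 primary + 1 replica = 2
```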

Conclusion

This article briefly introduced the common terms used in ES. Once you know these terms, discussions about ES become much easier to follow.