This article is a set of notes recorded alongside a video tutorial. The notes are concise; for the complete content, refer to the video. Video download link: https://pan.baidu.com/s/1NHoe… Extraction code: KZV7

1. Introduction to Elasticsearch word segmentation

1.1 Built-in analyzers

The core function of Elasticsearch is data retrieval, which starts with writing documents into ES through an index. Text analysis at query time is mainly divided into two steps:

  1. Tokenization: the tokenizer converts the input text into a stream of terms.
  2. Filtering: for example, a stop word filter removes irrelevant terms from the stream. There are also synonym filters, lowercase filters, and so on.
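
The two steps above can be illustrated with a toy pipeline in Python. This is only a sketch of the idea, not how Elasticsearch implements analysis: the regular expression, the stop word list, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical stop word list, chosen only for this illustration.
STOP_WORDS = {"the", "a", "an", "of", "is"}


def tokenize(text):
    """Step 1: split the input text into a stream of terms."""
    return re.findall(r"\w+", text)


def apply_filters(terms):
    """Step 2: lowercase every term, then drop the stop words."""
    lowered = (term.lower() for term in terms)
    return [term for term in lowered if term not in STOP_WORDS]


def analyze(text):
    """Run tokenization followed by filtering."""
    return apply_filters(tokenize(text))


print(analyze("The Romance of Three Kingdoms"))
# prints ['romance', 'three', 'kingdoms']
```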

Elasticsearch has a number of built-in analyzers for you to use.

Built-in analyzers and their roles:

  • Standard Analyzer: the standard analyzer, suitable for English and similar languages.
  • Simple Analyzer: splits text on non-letter characters and converts terms to lowercase.
  • Whitespace Analyzer: splits text on whitespace.
  • Stop Analyzer: similar to the simple analyzer, with added stop word removal.
  • Keyword Analyzer: the output text is identical to the input text.
  • Pattern Analyzer: splits text with a regular expression and supports stop words.
  • Language Analyzer: analyzers for specific languages.
  • Fingerprint Analyzer: creates a fingerprint of the text that can be used for duplicate detection.

1.2 Chinese analyzer

elasticsearch-analysis-ik is a third-party plugin for ES. Its code is on GitHub:

  • https://github.com/medcl/elas…

1.2.1 Installation

There are two ways to install it:

The first:

  1. First open the analyzer's official site: https://github.com/medcl/elas… .
  2. On the https://github.com/medcl/elas… page, find the latest official release and download it. The download link here is https://github.com/medcl/elas… .
  3. Unzip the downloaded file.
  4. In the es/plugins directory, create a new ik directory and copy all the extracted files into it.
  5. Restart the ES service.

The second:

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.9.3/elasticsearch-analysis-ik-7.9.3.zip

1.2.2 Test

After ES restarts successfully, first create an index named test:

Next, run an analysis test against that index:
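
The two requests themselves did not survive in these notes. As a rough sketch of what they could look like, the Python below builds (without sending) the equivalent of PUT test and POST test/_analyze. The ES address, the sample text, and the helper name build_request are assumptions for the example; ik_max_word is one of the analyzers that the IK plugin registers (ik_smart is the other).

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local, unsecured ES node


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the ES REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        f"{ES_URL}/{path}",
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


# Equivalent of "PUT test": create the index named test.
create_index = build_request("PUT", "test")

# Equivalent of "POST test/_analyze": run the IK analyzer on a sample text.
analyze = build_request(
    "POST",
    "test/_analyze",
    {"analyzer": "ik_max_word", "text": "中华人民共和国国歌"},
)
```

Passing either object to urllib.request.urlopen would actually send it; the response to the _analyze call lists the tokens that the analyzer produced.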

1.2.3 Custom extension dictionary

Local customization

In the es/plugins/ik/config directory, create a new ext.dic file (any name will do) in which you can configure your own words.

If there is more than one word, put each word on its own line.

Then configure the location of the extension dictionary in es/plugins/ik/config/IKAnalyzer.cfg.xml:

Remote dictionary

You can also configure a remote dictionary, which supports hot updates (changes take effect without restarting ES).

For hot updates, you only need to provide an HTTP endpoint that returns the extension words.

The specific steps are as follows: create a new Spring Boot project and add the Web dependency. Then create a new ext.dic file in the resources/static directory and write the extension words into it:

Next, configure the remote extension dictionary endpoint in es/plugins/ik/config/IKAnalyzer.cfg.xml:

After the configuration is complete, restart ES and it will take effect.

Hot updates rely on the response headers: when the value of Last-Modified or ETag changes, the remote extension dictionary is reloaded automatically.
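
To make the header mechanism concrete, here is a minimal sketch of such an endpoint. It uses Python's standard library instead of the Spring Boot project described above, purely to keep the example short. The word list, the port, and all names are assumptions; the only behavior taken from the notes is that the response carries Last-Modified and ETag headers and that the body holds the extension words, one per line.

```python
import hashlib
from email.utils import formatdate
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical extension words; a real service would load them from storage.
WORDS = ["云计算", "区块链"]
LAST_CHANGED = 0.0  # assumed: epoch seconds of the last dictionary change


def dictionary_response(words, changed_at):
    """Return (headers, body) for the dictionary: one UTF-8 word per line."""
    body = "\n".join(words).encode("utf-8")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(body)),
        # A change in either of these two headers triggers a reload.
        "Last-Modified": formatdate(changed_at, usegmt=True),
        "ETag": '"' + hashlib.sha256(body).hexdigest() + '"',
    }
    return headers, body


class DictHandler(BaseHTTPRequestHandler):
    def _send(self, include_body):
        headers, body = dictionary_response(WORDS, LAST_CHANGED)
        self.send_response(200)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_HEAD(self):  # header-only check
        self._send(include_body=False)

    def do_GET(self):  # full dictionary download
        self._send(include_body=True)


def serve(port=8080):
    """Start the endpoint; blocks until interrupted."""
    HTTPServer(("", port), DictHandler).serve_forever()
```

Calling serve() starts the endpoint. Because the ETag here is derived from the body, editing WORDS changes the header automatically.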


2. Elasticsearch index management

Go to ElasticSearch05 to download the elasticSearch05 script.

Start one master node and two slave nodes for testing (see the setup video in Episode 2).

2.1 New Index

2.1.1 Create a new index through the HEAD plugin

In the Head plugin, select the Index tab, and then click New Index. When creating a new index, you need to fill in the index name, the number of shards, and the number of replicas.

After the index is created successfully, it looks like this:

0, 1, 2, 3, and 4 are the shards of the index. A thick border marks a primary shard and a thin border marks a replica (click a box to check whether it is a primary or a replica through its primary attribute). The .kibana index has only one shard and one replica, so it shows only 0.

2.1.2 Created by request

Requests can be sent with Postman or with Kibana. Kibana is used here because it offers auto-completion.

Create index request:

PUT book

After the creation is successful, you can view the index information:

Two points need to be noted:

  • Index names cannot contain uppercase letters

  • The index name must be unique. Creating an index with a name that already exists produces an error
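
The two points above can be expressed as a small client-side check. This is a sketch only: the function name and the sample set of existing indexes are assumptions, and Elasticsearch enforces further naming rules that are not covered here.

```python
def check_index_name(name, existing):
    """Return a list of problems with a proposed index name.

    Only the two rules listed above are checked; Elasticsearch itself
    enforces additional naming rules that this sketch ignores.
    """
    problems = []
    if name != name.lower():
        problems.append("index names cannot contain uppercase letters")
    if name in existing:
        problems.append("an index with this name already exists")
    return problems


existing_indexes = {"book", "test"}  # hypothetical cluster state
print(check_index_name("Book", existing_indexes))
print(check_index_name("book", existing_indexes))
print(check_index_name("book_new", existing_indexes))
```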

2.2 Update Index

Once the index is created, you can modify its properties.

For example, change the number of replicas of an index:

PUT book/_settings
{
  "number_of_replicas": 2
}

After successful modification, it is as follows:

The number of primary shards is different: it is fixed when the index is created and cannot be updated through _settings in the same way.

2.3 Change the read and write permissions of the index

After the index is created successfully, you can write documents to the index:

PUT book/_doc/1
{"title": "Romance of Three Kingdoms"}

After the write is successful, you can view it in the HEAD plug-in:

By default, indexes have read and write permission, which can be turned off.

For example, turn off write permission for an index:

PUT book/_settings
{
  "blocks.write": true
}

Once write permission is turned off, documents can no longer be added. To turn it on again, do this:

PUT book/_settings
{
  "blocks.write": false
}

Other similar permissions are:

  • blocks.write
  • blocks.read
  • blocks.read_only
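
Since the three settings are toggled in the same way, the requests can be generated from one helper. A minimal sketch: the function name is an assumption, and the output is just the console-style text that could be pasted into Kibana.

```python
import json

# The three block settings listed above.
BLOCK_SETTINGS = ("blocks.write", "blocks.read", "blocks.read_only")


def block_request(index, setting, enabled):
    """Return the console-style request that turns one index block on or off."""
    if setting not in BLOCK_SETTINGS:
        raise ValueError(f"unsupported block setting: {setting}")
    body = json.dumps({setting: enabled}, indent=2)
    return f"PUT {index}/_settings\n{body}"


print(block_request("book", "blocks.read_only", True))
```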

2.4 Viewing Index

Viewing in the Head plugin:

Viewing by request:

GET book/_settings

You can also view multiple index information at the same time:

GET book,test/_settings

You can also view all index information:

GET _all/_settings

2.5 Delete index

The HEAD plugin can delete indexes:

Request to delete as follows:


Deleting a nonexistent index results in an error.
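
The delete request itself is missing from these notes. By analogy with the other requests it would presumably be a DELETE on the index name, for example DELETE book; treat that as an assumption. The Python sketch below sends such a request and turns the "nonexistent index" error into a return value. The ES address and the function name are also assumptions.

```python
import urllib.error
import urllib.request


def delete_index(name, base_url="http://localhost:9200"):
    """Send DELETE <name>; return True if deleted, False if it did not exist."""
    request = urllib.request.Request(f"{base_url}/{name}", method="DELETE")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:  # the index does not exist
            return False
        raise  # any other error is left for the caller to handle
```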

2.6 Open/close index

Close index:

POST book/_close

Open index:

POST book/_open

Of course, you can close or open multiple indexes at the same time by separating their names with commas, or simply use _all to represent all indexes.

2.7 Replication Index

Index replication copies only the data, not the index configuration.

POST _reindex
{
  "source": {"index":"book"},
  "dest": {"index":"book_new"}
}

When copying, you can add query criteria.
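
As a sketch of what adding query criteria looks like, the helper below builds a _reindex body whose source section carries an optional query, so that only matching documents are copied. The helper name and the sample match query on the title field are assumptions for illustration.

```python
import json


def reindex_body(source, dest, query=None):
    """Build the body of POST _reindex, optionally restricted by a query."""
    source_part = {"index": source}
    if query is not None:
        source_part["query"] = query  # only matching documents are copied
    return {"source": source_part, "dest": {"index": dest}}


# Hypothetical criteria: copy only documents whose title matches "Kingdoms".
body = reindex_body("book", "book_new", {"match": {"title": "Kingdoms"}})
print(json.dumps(body, indent=2))
```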

2.8 Index Alias

You can create an alias for an index and use it in place of the index name, provided the alias is unique.

POST /_aliases
{
  "actions": [
    { "add": { "index": "book", "alias": "book_alias" } }
  ]
}

The result after adding is as follows:

Change add to remove to remove the alias:

POST /_aliases
{
  "actions": [
    { "remove": { "index": "book", "alias": "book_alias" } }
  ]
}

To view aliases for an index:

GET /book/_alias

To view the index that an alias points to (book_alias is an alias):

GET /book_alias/_alias

You can view all available aliases on the cluster:

GET /_alias
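
Because actions is a list, one _aliases request can carry several add and remove entries. The sketch below builds such a body to move book_alias from book to book_new in a single request. The helper names are assumptions, and book_new is borrowed from the reindex example only for illustration.

```python
import json


def alias_action(kind, index, alias):
    """Build one entry of the "actions" list; kind is "add" or "remove"."""
    if kind not in ("add", "remove"):
        raise ValueError(f"unsupported alias action: {kind}")
    return {kind: {"index": index, "alias": alias}}


def aliases_body(*actions):
    """Wrap the given actions into the body of POST /_aliases."""
    return {"actions": list(actions)}


body = aliases_body(
    alias_action("remove", "book", "book_alias"),
    alias_action("add", "book_new", "book_alias"),
)
print(json.dumps(body, indent=2))
```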

Finally, Songge has also collected more than 50 project requirement documents. If you want a project to practice on, you may wish to take a look.

The requirements document address: https://github.com/lenve/javadoc