Introduction to ElasticSearch

My Spring Security series has come to an end, and I plan to take some time to compile those tutorials and share them with you. I have had a short rest recently. Not writing tutorials means I can go to bed early, but I get bored. I need something to do, and it feels like time to start a new journey.

I first wanted to write an ES tutorial when Elastic, the company behind Elasticsearch, went public during the National Day holiday in 2018, but it never happened and has stayed on my mind ever since. I have some free time now, so I want to see whether I can finish the series.

Unlike my previous tutorials, this one will combine video with illustrated text. The video is the main material and the text is supplementary. I will upload the videos to Baidu Netdisk, and the end of each article has the corresponding video download link.

Elasticsearch is pretty popular right now. It is used for in-site search and log analysis, and it can even be used directly as a NoSQL database.

Let's start our ES journey with a brief introduction.

Songge recorded a video for this article, as follows:

Video download link: https://pan.baidu.com/s/1bIvt… Extract code: PM94

1. Lucene

Lucene is an open-source, free, high-performance full-text search library written in pure Java. It can be regarded as the best full-text search toolkit in the open-source world.

In real-world development, Lucene suits almost any scenario that requires full-text search, so it has been ported to many languages, such as C++, C#, and Python.

Back in 2005, Lucene became an Apache top-level project. It was written by Doug Cutting. Some of you may not have heard of him, but you have certainly heard of his other great work, Hadoop.

Note, however, that Lucene is only a toolkit, not a complete search engine. Developers can build complete search engines on top of it. Solr and ElasticSearch are the best-known examples, and in distributed and big-data environments ElasticSearch generally has the edge.

Lucene mainly has the following features:

Simple
Cross-language
Powerful search engine
Fast indexing
Index files are compatible with different platforms

2. ElasticSearch

ElasticSearch is a distributed, scalable, near-real-time, high-performance search and data analytics engine written in Java. By wrapping Lucene one level further, ElasticSearch hides the complexity of search, so developers can perform full-text search through a simple set of RESTful APIs.

Elasticsearch works well in distributed environments, which is one of the reasons for its popularity. It can handle petabyte-scale structured or unstructured data.

Overall, ElasticSearch has three main functions:

Data collection
Data analysis
Data storage

Key features of Elasticsearch:

Distributed file storage.
A distributed search engine with real-time analytics.
High scalability.
Pluggable plugin support.

3. Installation

3.1 Single-node installation

First, open the Elasticsearch page on the official ES website:

https://www.elastic.co/cn/ela…

Then click the download button and select the appropriate version to download directly.

Unzip the downloaded file. The directories it contains are as follows:

Directory	Meaning
modules	Dependent module directory
lib	Third-party dependent libraries
logs	Output log directory
plugins	Plug-in directory
bin	Directory of executable files
config	Configuration file directory
data	Data storage directory

How to start:

Go to the bin directory and run ./elasticsearch.

When you see "started" in the console output, the startup was successful.

The default listening port is 9200, so simply type localhost:9200 in the browser to see the node information.
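The same check can be scripted. Below is a minimal sketch, assuming a node is reachable at http://localhost:9200. The helper names and the sample values (including the version number) are made up for illustration; only the field names follow the JSON that the root endpoint returns.

```python
import json
from urllib.request import urlopen


def summarize_node_info(info: dict) -> str:
    """Condense the JSON returned by GET / into a one-line summary."""
    version = info.get("version", {}).get("number", "unknown")
    return f"node={info.get('name')} cluster={info.get('cluster_name')} version={version}"


def fetch_node_info(base_url: str = "http://localhost:9200") -> dict:
    """Request the root endpoint of a running node and decode the JSON body."""
    with urlopen(base_url, timeout=5) as response:
        return json.load(response)


# Sample payload shaped like the root response; the values are invented.
sample = {
    "name": "master",
    "cluster_name": "javaboy-es",
    "version": {"number": "7.9.3"},
    "tagline": "You Know, for Search",
}
print(summarize_node_info(sample))
```

With a node running, `summarize_node_info(fetch_node_info())` gives the same kind of summary for the live node.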

Both the node name and the cluster name (elasticsearch by default) can be customized.

Open the config/elasticsearch.yml file to configure the cluster name and node name. The configuration is as follows:

cluster.name: javaboy-es
node.name: master

When the configuration is complete, save the configuration file and restart ES. After a successful restart, refresh the browser’s localhost:9200 page to see the latest information.

ES support matrix:

https://www.elastic.co/cn/sup…

3.2 Head plug-in installation

Elasticsearch-head allows you to visually view cluster information.

Two ways to install it are described here.

3.2.1 Browser plug-in installation

Go to your browser's extension store, search for ElasticSearch-Head, and click Install.

An offline installation package for Elasticsearch-head is also available for download.

3.2.2 Download and install the plug-in

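It takes four steps (the clone uses https because GitHub no longer serves the unauthenticated git:// protocol):

git clone https://github.com/mobz/elasticsearch-head.git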
cd elasticsearch-head
npm install
npm run start

After it starts successfully, open the Head page in the browser (it listens on port 9100 by default).

Note that no cluster data shows up at this point. The Head page requests the cluster data across origins, and by default ES does not allow cross-origin requests, so nothing is displayed.

Add the following to the ES config/elasticsearch.yml file to enable cross-origin support:

http.cors.enabled: true
http.cors.allow-origin: "*"

When the configuration is complete, restart ES, and the cluster data appears in Head.

3.3 Distributed installation

Assumptions:

One master and two slaves
The master port is 9200, and the slave ports are 9201 and 9202

First, modify the master's config/elasticsearch.yml file:

node.master: true
network.host: 127.0.0.1

After the configuration is complete, restart the master.

Unzip two more copies of the ES package and name them slave01 and slave02. They stand for the two slave machines.

Configure them separately.

slave01/config/elasticsearch.yml:

cluster.name: javaboy-es
node.name: slave01
network.host: 127.0.0.1
http.port: 9201
discovery.zen.ping.unicast.hosts: ["127.0.0.1"]

slave02/config/elasticsearch.yml:

cluster.name: javaboy-es
node.name: slave02
network.host: 127.0.0.1
http.port: 9202
discovery.zen.ping.unicast.hosts: ["127.0.0.1"]

Then start Slave01 and Slave02, respectively. Once started, you can view the cluster information on the HEAD plug-in.
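Since the three configuration files differ only in a few values, they can be generated rather than typed by hand. The following is a small sketch of that idea and not part of ES itself: `render_node_config` is a hypothetical helper, and writing `http.port: 9200` explicitly for the master is my assumption (the master above simply relies on the default port).

```python
def render_node_config(node_name: str, http_port: int, is_master: bool,
                       cluster_name: str = "javaboy-es",
                       host: str = "127.0.0.1") -> str:
    """Return the elasticsearch.yml text for one node of the local cluster."""
    lines = [
        f"cluster.name: {cluster_name}",
        f"node.name: {node_name}",
        f"network.host: {host}",
        f"http.port: {http_port}",
    ]
    if is_master:
        lines.append("node.master: true")
    else:
        # Slaves locate the cluster through the master's address.
        lines.append(f'discovery.zen.ping.unicast.hosts: ["{host}"]')
    return "\n".join(lines) + "\n"


# One master and two slaves, matching the assumptions of this section.
nodes = [("master", 9200, True), ("slave01", 9201, False), ("slave02", 9202, False)]
for name, port, is_master in nodes:
    print(f"# {name}/config/elasticsearch.yml")
    print(render_node_config(name, port, is_master))
```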

3.4 Kibana installation

Kibana is Elastic's analytics and data visualization platform for ES. With it you can search and view the data stored in ES.

Installation steps are as follows:

Download Kibana: https://www.elastic.co/cn/dow…
Unzip it
Configure the ES address (optional: if ES uses the default address and port, no configuration is needed; the configuration file is config/kibana.yml)
Run ./bin/kibana to start it
Open localhost:5601 in the browser

The first time you open Kibana after installation, you can choose whether or not to load the sample data that ES provides.

4. Introduction to ElasticSearch Core Concepts

4.1 Ten core Elasticsearch concepts

4.1.1 Cluster

One or more servers with ES nodes installed, organized together, form a cluster. These nodes jointly hold the data and jointly provide search services.

A cluster has a name, which is elasticsearch by default. Only nodes with the same cluster name form one cluster.

You can configure the cluster name in the config/elasticsearch.yml file:

cluster.name: javaboy-es

A cluster has three health states: green, yellow, and red:

Green: the cluster is healthy. All primary shards and replica shards are working normally.
Yellow: the cluster is in a warning state. All primary shards are working normally, but at least one replica shard is not.
Red: the cluster is not working properly. At least one primary shard is unavailable.
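The three colors boil down to one rule about which shards are unavailable. Here is a minimal sketch of that rule, not ES's own implementation: the `Shard` class and its `assigned` flag are hypothetical names I made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shard:
    index: str
    primary: bool   # True for a primary shard, False for a replica
    assigned: bool  # hypothetical flag: True when the shard is up and serving


def cluster_health(shards: List[Shard]) -> str:
    """Derive green / yellow / red from the state of every shard."""
    if any(s.primary and not s.assigned for s in shards):
        return "red"      # some primary is down, so data is unavailable
    if any(not s.primary and not s.assigned for s in shards):
        return "yellow"   # all primaries are fine, but a replica is missing
    return "green"        # every primary and replica is working


shards = [
    Shard("blog", primary=True, assigned=True),
    Shard("blog", primary=False, assigned=False),
]
print(cluster_health(shards))
```

This also explains why a single-node setup with one replica per shard often shows yellow: there is no second node to hold the replica.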

4.1.2 Nodes

A server in the cluster is a node. Nodes store data and take part in the cluster's indexing and searching. A node that wants to join a cluster only needs to configure the cluster name. By default, if we start several nodes and they can discover each other, they automatically form a cluster. ES provides this out of the box, but the approach is not reliable and can lead to split-brain. In practice, therefore, it is recommended to configure the cluster information manually.

4.1.3 Index

"Index" can be understood in two ways:

As a noun

A collection of documents with similar characteristics.

As a verb

To index data, that is, to perform the indexing operation on it.

4.1.4 Type

A type is a logical category or partition within an index. Before ES 6, an index could contain multiple types. Indexes created in ES 6.x can contain only one type, although multi-type indexes created in 5.x keep working in 6.x for compatibility. Starting with ES 7, types are deprecated and an index effectively has a single type (_doc by default).

4.1.5 Document

A document is a unit of data that can be indexed, for example a user document or a product document. Documents are in JSON format.

4.1.6 Shards

Indexes are stored on nodes. However, because a node's disk space and processing capacity are limited, a single node may not handle a large index well. In that case we can split the index into shards. When we create an index, we specify the number of shards. Each shard is itself a fully functional, independent index.

By default (since ES 7; earlier versions defaulted to 5 shards), an index is created with 1 primary shard, and one replica is created for each shard.
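To get a feel for how documents end up on different shards, here is a toy sketch. It assumes the general "hash the document id, then take it modulo the number of primary shards" idea. CRC32 is only a stand-in for the hash function ES uses internally, and `route_to_shard` is a hypothetical name.

```python
import zlib


def route_to_shard(doc_id: str, number_of_primary_shards: int) -> int:
    """Pick a shard for a document: hash of the id modulo the shard count."""
    # CRC32 is a stand-in here; ES uses its own hash function internally.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


doc_ids = [f"user-{n}" for n in range(10)]
for shard_count in (1, 3):
    placement = {doc_id: route_to_shard(doc_id, shard_count) for doc_id in doc_ids}
    print(shard_count, placement)
```

Because the shard number depends on the shard count, changing the count would send existing ids to different shards. That is one way to see why the number of shards is fixed when the index is created.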

4.1.7 Replicas

A replica, also called a backup, is a copy of a primary shard.

4.1.8 Settings

Settings hold the definition of an index within the cluster, such as its number of shards and number of replicas.

4.1.9 Mapping

A mapping holds the definitions of an index's fields: each field's data type, how it is analyzed (tokenized), whether it is stored, and so on.
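Settings and mapping usually travel together in the request body used to create an index. Below is a sketch of such a body, assuming the ES 7 format without types. The index name `products` and the fields `title` and `price` are made up for illustration.

```python
import json

# Body for a request such as PUT /products (the index name is hypothetical).
create_index_body = {
    "settings": {
        "number_of_shards": 1,     # primary shards, fixed at creation time
        "number_of_replicas": 1,   # copies kept of each primary shard
    },
    "mappings": {
        "properties": {
            # Full-text field: analyzed with the built-in standard analyzer.
            "title": {"type": "text", "analyzer": "standard", "store": False},
            # Numeric field: not analyzed, matched by exact value or range.
            "price": {"type": "double"},
        }
    },
}

print(json.dumps(create_index_body, indent=2))
```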

4.1.10 Analyzer

An analyzer defines how a field's text is split into terms (tokenized).
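What an analyzer does is easier to see with a toy version. The sketch below is a deliberately simple stand-in (lowercase, then split on anything that is not a letter or digit), not one of ES's real analyzers. It feeds the resulting terms into a small inverted index, the structure that full-text search is built on.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase the text and split on non-alphanumeric characters."""
    return [token for token in re.split(r"[^a-z0-9]+", text.lower()) if token]


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


docs = {
    1: "Lucene is a search toolkit",
    2: "Elasticsearch wraps Lucene",
}
index = build_inverted_index(docs)
print(analyze("Hello, Elasticsearch World!"))
print(sorted(index["lucene"]))
```

Searching for a word then means analyzing the query the same way and looking up the resulting terms in the index.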

4.2 ElasticSearch Vs Relational Database

Relational database	ElasticSearch
Database	Index
Table	Type
Row	Document
Column	Field
Table structure	Mapping
SQL	DSL (Domain Specific Language)
Select * from xxx	GET http://
update xxx set xx=xxx	PUT http://
Delete xxx	DELETE http://
Index	Full-text index
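To make the SQL-to-REST rows concrete, here is a small sketch that turns a SQL-style action on a single document into the HTTP method and path one would send to ES. It assumes the ES 7 style `/{index}/_doc/{id}` document endpoint. `document_request` and the index name `blog` are hypothetical.

```python
# SQL-style action -> HTTP method, following the comparison table.
METHODS = {"select": "GET", "update": "PUT", "delete": "DELETE"}


def document_request(action: str, index: str, doc_id: str):
    """Return the (HTTP method, path) pair for one operation on one document."""
    return METHODS[action], f"/{index}/_doc/{doc_id}"


for action in ("select", "update", "delete"):
    method, path = document_request(action, "blog", "1")
    print(f"{action:<6} -> {method} {path}")
```

Note that a PUT to this path replaces the whole document, so it is closer to rewriting an entire row than to updating a single column.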

This article is really the set of notes I took while recording the ES video tutorial, so some of them are fairly brief. You can also refer to the video. Video download link: https://pan.baidu.com/s/1bIvt… Extract code: PM94