I. Getting to know ES

II. What ES does

Distributed operation, search (full-text search and structured search), and data analysis.

III. The history of Lucene and ES

Lucene is one of the most advanced and powerful search libraries, but developing directly against Lucene is very complex.

Elasticsearch is built on Lucene; it hides that complexity and provides a simple, easy-to-use API.

IV. Core concepts of ES

  1. Near Realtime (NRT): there is a small delay (about 1 second) between the time data is written and the time it becomes searchable; searches and analysis on ES return within seconds.
  2. Cluster: a cluster contains multiple nodes. Which cluster a node belongs to is determined by its configured cluster name (the default cluster name is elasticsearch).
  3. Node: a node is one ES instance in a cluster. By default a node joins the cluster named elasticsearch, so if you start a bunch of nodes directly, they automatically form one elasticsearch cluster. A single node can also form a cluster on its own.
  4. Document: the smallest data unit in ES. Multiple documents can be stored under a type in each index. A document has multiple fields, and each field holds one piece of data.
  5. Index: a collection of documents with a similar structure.
  6. Type: a logical category of documents within an index. Each index can have one or more types.
  7. Shard: a single machine cannot store a large amount of data, so ES can split the data in an index into multiple shards, which can be stored on multiple servers. With shards you can scale horizontally, store more data, and spread operations such as search and analysis across multiple servers, improving throughput and performance.
  8. Replica: any server can fail or go down at any time, and its shards would then be lost, so ES can create multiple replicas of each shard. A replica takes over when a shard fails, ensuring that data is not lost. By default (in ES versions before 7.0) each index has 10 shards: 5 primary shards and 5 replica shards, one replica per primary. The minimum highly available configuration is two servers.
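The shard arithmetic in item 8 can be sketched in a few lines of Python. The function names `total_shards` and `min_servers` are hypothetical helpers made up for this illustration; they are not part of any ES API.

```python
def total_shards(primary_shards: int, replicas_per_primary: int) -> int:
    """Total shard copies for one index: every primary plus its replicas."""
    return primary_shards * (1 + replicas_per_primary)


def min_servers(replicas_per_primary: int) -> int:
    """A replica is not placed on the same node as its primary, so each
    extra copy of a shard needs one extra server."""
    return 1 + replicas_per_primary


# 5 primary shards with 1 replica each, as described in item 8.
print(total_shards(5, 1))
print(min_servers(1))
```

With 5 primaries and 1 replica per primary this gives the 10 shards and the two-server minimum mentioned above.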

V. Simple cluster management

To quickly check the health status of a cluster: GET /_cat/health

  • Green: all primary shards and replica shards of every index are active
  • Yellow: all primary shards of every index are active, but some replica shards are not
  • Red: not all primary shards are active, so some indexes are missing data
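The three status rules can be written out as a small decision function. This is only a toy model of the rules listed above, assuming each shard is given as a hypothetical `(kind, active)` pair; it is not how ES computes health internally.

```python
from typing import Iterable, Tuple


def cluster_health(shards: Iterable[Tuple[str, bool]]) -> str:
    """Classify health from (kind, active) pairs, kind is 'primary' or 'replica'."""
    shards = list(shards)
    primaries_ok = all(active for kind, active in shards if kind == "primary")
    replicas_ok = all(active for kind, active in shards if kind == "replica")
    if not primaries_ok:
        return "red"      # some primary shard is not active
    if not replicas_ok:
        return "yellow"   # all primaries active, some replica is not
    return "green"        # every primary and replica is active


print(cluster_health([("primary", True), ("replica", False)]))
```

A single-node cluster with replicas configured typically lands in the yellow case, because the replicas have no second node to be placed on.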

List the indexes: GET /_cat/indices
Create an index: PUT /test_index?pretty
Delete an index: DELETE /test_index?pretty

VI. Simple CRUD operations

Add: PUT /index/type/id
Query: GET /index/type/id
Update: POST /index/type/id/_update {"doc":{"key":"value"}}
Delete: DELETE /index/type/id
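As a rough sketch, the four operations map to HTTP method, path and body as follows. The function `crud_request` and the sample index, type and field names are hypothetical; the code only builds the request pieces and does not talk to a cluster.

```python
import json
from typing import Optional, Tuple


def crud_request(op: str, index: str, doc_type: str, doc_id: str,
                 body: Optional[dict] = None) -> Tuple[str, str, Optional[str]]:
    """Return (HTTP method, path, JSON body) for one CRUD operation."""
    path = f"/{index}/{doc_type}/{doc_id}"
    if op == "add":
        return "PUT", path, json.dumps(body or {})
    if op == "get":
        return "GET", path, None
    if op == "update":
        # A partial update wraps the changed fields in "doc".
        return "POST", path + "/_update", json.dumps({"doc": body or {}})
    if op == "delete":
        return "DELETE", path, None
    raise ValueError(f"unknown operation: {op}")


print(crud_request("update", "test_index", "product", "1", {"name": "x"}))
```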

VII. Search methods

1. Query string search: rarely used in practice

    GET /index/type/_search

The response contains:

  • took: how long the request took, in milliseconds
  • timed_out: whether the request timed out
  • _shards: the data is split into five shards, so a search request hits all of the primary shards
  • hits.total: the number of matching documents
  • hits.max_score: a score is the relevance of a document to the search; the more relevant the document, the higher its score
  • hits.hits: the detailed data of the documents matching the search

Query with results sorted in descending order:

GET /index/type/_search?q=key:value&sort=key:desc
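For illustration, the query string can be assembled with the Python standard library so that special characters in the value are escaped. The helper name `search_path` is hypothetical, and keeping `:` unescaped is a choice made here only to match the readable form shown above.

```python
from urllib.parse import urlencode


def search_path(index: str, doc_type: str, field: str, value: str,
                sort_field: str, order: str = "desc") -> str:
    """Build the path and query string for a query string search."""
    params = {"q": f"{field}:{value}", "sort": f"{sort_field}:{order}"}
    # safe=":" leaves the field:value separator readable in the URL.
    return f"/{index}/{doc_type}/_search?" + urlencode(params, safe=":")


print(search_path("index", "type", "key", "value", "key"))
```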

2. Query DSL (Domain Specific Language)

Query all documents:

GET /index/type/_search
{
    "query":{"match_all":{}}
}

Query the documents whose key field contains value, sorted by key in descending order:

GET /index/type/_search
{
    "query":{
        "match":{
            "key":"value"
        }
    },
    "sort":[
        {"key":"desc"}
    ]
}

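Paged query: suppose there are 3 documents in total and each page shows 1 document. To display page 2, fetch the second document. "from" is the offset of the first document to return (counting from 0) and "size" is how many documents to return: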

GET /index/type/_search
{
    "query":{"match_all":{}},
    "from":1,
    "size":1
}
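The relation between a page number and the from/size pair can be sketched as follows. The helper `page_params` is hypothetical and assumes pages are numbered from 1.

```python
def page_params(page: int, page_size: int) -> dict:
    """Translate a 1-based page number into from/size values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    # "from" is a zero-based offset, so page 1 starts at offset 0.
    return {"from": (page - 1) * page_size, "size": page_size}


# Page 2 with 1 document per page, as in the example above.
body = {"query": {"match_all": {}}, **page_params(2, 1)}
print(body)
```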

Return only the specified fields:

GET /index/type/_search
{
    "query":{"match_all":{}},
    "_source":["key1","key2"]
}

3. Query filter

"must" holds the condition that has to match, "filter" filters the results, and "gt" in the range clause means greater than:

GET /index/type/_search
{
    "query":{
        "bool":{
            "must":{
                "match":{"key":"value"}
            },
            "filter":{
                "range":{
                    "key":{"gt":value}
                }
            }
        }
    }
}
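When such a body is built from program variables, generating it from a dictionary avoids quoting mistakes. This is a minimal sketch; the function `bool_query` and the sample field names `name` and `price` are assumptions made for the example.

```python
import json


def bool_query(match_field: str, text: str,
               range_field: str, greater_than: float) -> dict:
    """Build a bool query: a must match clause plus a range filter (gt)."""
    return {
        "query": {
            "bool": {
                "must": {"match": {match_field: text}},
                "filter": {"range": {range_field: {"gt": greater_than}}},
            }
        }
    }


print(json.dumps(bool_query("name", "toothpaste", "price", 25), indent=4))
```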

4. Full-text search

GET /index/type/_search
{
    "query":{
        "match":{
            "key":"value1 value2"
        }
    }
}

5. Phrase search

Phrase search is the counterpart of full-text search. Full-text search splits the input search string into terms and looks each one up in the inverted index; a document is returned as a result as long as any one term matches. Phrase search requires the specified field to contain the search string exactly, with the same words in the same order, before the document counts as a match and is returned.

GET /index/type/_search
{
    "query":{
        "match_phrase":{"key":"value"}
    }
}
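The difference between the two search styles can be illustrated with a toy inverted index that records term positions. This is a simplified sketch, not how Lucene implements it: tokenization is a plain whitespace split, and the sample documents and function names are made up for the example.

```python
from collections import defaultdict


def build_index(docs: dict) -> dict:
    """Map each term to {doc_id: [positions of the term in that doc]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def match(index: dict, query: str) -> set:
    """Full-text style: a document matches if ANY query term occurs in it."""
    hits = set()
    for term in query.lower().split():
        hits |= set(index.get(term, {}))
    return hits


def match_phrase(index: dict, query: str) -> set:
    """Phrase style: all terms must occur consecutively, in query order."""
    terms = query.lower().split()
    if not terms or any(t not in index for t in terms):
        return set()
    hits = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            if all(start + i in index[t].get(doc_id, [])
                   for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


docs = {1: "quick brown fox", 2: "brown quick dog", 3: "lazy cat"}
index = build_index(docs)
print(match(index, "quick brown"), match_phrase(index, "quick brown"))
```

Documents 1 and 2 both contain the individual terms, but only document 1 contains them as an adjacent, ordered phrase.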

6. Highlight search

GET /index/type/_search
{
    "query":{
        "match":{"key":"value"}
    },
    "highlight":{
        "fields":{"key":{}}
    }
}
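To show the idea behind highlighting, here is a toy Python function that wraps matching words in tags. It assumes whole-word, case-insensitive matching and uses `<em>` tags, which are the ES defaults; the real highlighter works on analyzed terms and is considerably more involved.

```python
import re


def highlight(text: str, query: str,
              pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap every whole-word occurrence of a query term in pre/post tags."""
    terms = [re.escape(t) for t in query.split()]
    if not terms:
        return text
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


print(highlight("Quick brown fox", "quick fox"))
```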

VIII. Aggregation analysis

Group by a key and count the documents in each group. "size":0 returns only the aggregation results, and "group_by" is an arbitrary name chosen for the aggregation:

GET /index/type/_search
{
    "size":0,
    "aggs":{
        "group_by":{
            "terms":{"field":"key"}
        }
    }
}

Search first, then group the matching documents:

GET /index/type/_search
{
    "size":0,
    "query":{
        "match":{"key":"value"}
    },
    "aggs":{
        "all_tags":{
            "terms":{"field":"key"}
        }
    }
}

Group first, then compute the average within each group:

GET /index/type/_search
{
    "size":0,
    "aggs":{
        "group_by":{
            "terms":{"field":"key"},
            "aggs":{
                "avg_xx":{
                    "avg":{"field":"key"}
                }
            }
        }
    }
}

Group, compute the average within each group, and sort the groups by that average in descending order:

GET /index/type/_search
{
    "size":0,
    "aggs":{
        "group_by":{
            "terms":{"field":"key","order":{"abc":"desc"}},
            "aggs":{
                "abc":{
                    "avg":{"field":"value"}
                }
            }
        }
    }
}

Group by interval:

GET /index/type/_search
{
    "aggs":{
        "group_by":{
            "range":{
                "field":"key",
                "ranges":[
                    {"from":0,"to":20},
                    {"from":20,"to":40},
                    {"from":40,"to":60}
                ]
            }
        }
    }
}
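What these aggregations compute can be emulated in plain Python on a small in-memory list. This is only an illustration of the semantics under stated assumptions: the sample documents and the field names `tag` and `price` are made up, and the output shape is simplified compared with a real ES response. Range buckets include the "from" value and exclude the "to" value, as in ES.

```python
from collections import defaultdict


def terms_avg(docs: list, group_field: str, value_field: str) -> list:
    """Group by group_field, average value_field per group,
    and sort the groups by that average, highest first."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    buckets = [
        {"key": key, "doc_count": len(vals), "avg": sum(vals) / len(vals)}
        for key, vals in groups.items()
    ]
    return sorted(buckets, key=lambda b: b["avg"], reverse=True)


def range_buckets(docs: list, field: str, ranges: list) -> list:
    """Count documents per interval; from is inclusive, to is exclusive."""
    return [
        {"from": lo, "to": hi,
         "doc_count": sum(1 for d in docs if lo <= d[field] < hi)}
        for lo, hi in ranges
    ]


docs = [
    {"tag": "a", "price": 10},
    {"tag": "a", "price": 20},
    {"tag": "b", "price": 50},
    {"tag": "b", "price": 30},
]
print(terms_avg(docs, "tag", "price"))
print(range_buckets(docs, "price", [(0, 20), (20, 40), (40, 60)]))
```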