Overview

This article covers the basic usage of search templates, mapping templates, search highlighting, and geolocation.

Standard search templates

Search templates are an advanced feature that lets us parameterize a search: the query is written once as a template, and parameters are passed in when the template is used, so the query code is not duplicated. Encapsulating common queries in templates makes them easier to use.

This is similar to interface encapsulation in programming: the details are hidden behind an interface that others call, so callers only need to care about the parameters and the response, which improves code reuse.

Let’s look at some of the most basic uses.

Parameter substitution

GET /music/children/_search/template
{
  "source": {
    "query": {
      "match": { "{{field}}": "{{value}}" }
    }
  },
  "params": { "field": "name", "value": "bye-bye" }
}

When compiled, this search template is equivalent to:

GET /music/children/_search
{
  "query": {
    "match": { "name": "bye-bye" }
  }
}

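To make the substitution step concrete, here is a minimal Python sketch that imitates what happens to {{field}} and {{value}}. It is not Elasticsearch's Mustache engine: the render helper is hypothetical and handles only plain {{name}} placeholders.

```python
import json
import re


def render(source: str, params: dict) -> str:
    """Replace each {{name}} placeholder with the matching value from params.

    Hypothetical helper: plain variables only, no sections.
    A missing parameter renders as an empty string.
    """
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(params.get(m.group(1), "")), source)


# The template body from the request above, serialized as a string.
source = json.dumps({"query": {"match": {"{{field}}": "{{value}}"}}})
params = {"field": "name", "value": "bye-bye"}

search_body = json.loads(render(source, params))
print(search_body)
```

The printed body should have the same shape as the equivalent search shown above.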
Conditional query using JSON format

The {{#toJson}} block renders a parameter as JSON, which allows slightly more complex conditions to be passed in as a whole. Because the block is not valid JSON by itself, the template source is written as a string:

GET /music/children/_search/template
{
  "source": "{\"query\": {\"match\": {{#toJson}}condition{{/toJson}} }}",
  "params": {
    "condition": { "name": "bye-bye" }
  }
}

This search template is equivalent to the following when compiled:

GET /music/children/_search
{
  "query": {
    "match": { "name": "bye-bye" }
  }
}

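The effect of {{#toJson}} can be sketched in a few lines of Python: the parameter value is serialized to JSON text and spliced into the source string. The render_to_json helper below is a hypothetical stand-in, not the engine Elasticsearch uses.

```python
import json


def render_to_json(source: str, name: str, params: dict) -> str:
    """Expand one {{#toJson}}name{{/toJson}} block into the JSON text of params[name]."""
    block = "{{#toJson}}" + name + "{{/toJson}}"
    return source.replace(block, json.dumps(params[name]))


source = '{"query": {"match": {{#toJson}}condition{{/toJson}} }}'
params = {"condition": {"name": "bye-bye"}}

search_body = json.loads(render_to_json(source, "condition", params))
print(search_body)
```

Because the whole object is passed as one parameter, the caller can change both the field and the value without touching the template.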
Join syntax

The {{#join}} function concatenates the values of an array parameter into a single string, using the given delimiter:

GET /music/children/_search/template
{
  "source": {
    "query": {
      "match": {
        "name": "{{#join delimiter=' '}}names{{/join delimiter=' '}}"
      }
    }
  },
  "params": {
    "names": ["gymbo", "you are my sunshine", "bye-bye"]
  }
}

This search template is equivalent to the following when compiled:

GET /music/children/_search
{
  "query": {
    "match": { "name": "gymbo you are my sunshine bye-bye" }
  }
}

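In plain Python the join step amounts to a string join over the array parameter. The sketch below assumes the parameter is named names and uses a hypothetical render_join helper; the comma default mirrors the behaviour of {{#join}} without a delimiter.

```python
def render_join(values: list, delimiter: str = ",") -> str:
    """Concatenate the array values into one string (hypothetical helper)."""
    return delimiter.join(str(v) for v in values)


names = ["gymbo", "you are my sunshine", "bye-bye"]
search_body = {"query": {"match": {"name": render_join(names, delimiter=" ")}}}
print(search_body)
```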
Default values in search templates

A search template can define default values. For example, {{^end}}500{{/end}} means that if the end parameter is missing, the value 500 is used:

GET /music/children/_search/template
{
  "source": {
    "query": {
      "range": {
        "likes": { "gte": "{{start}}", "lte": "{{end}}{{^end}}500{{/end}}" }
      }
    }
  },
  "params": { "start": 1, "end": 300 }
}

When rendered, this search template is equivalent to the following (the values are strings because the placeholders sit inside quotes):

GET /music/children/_search
{
  "query": {
    "range": {
      "likes": { "gte": "1", "lte": "300" }
    }
  }
}

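The default-value idiom relies on an inverted section, which is emitted only when the parameter is absent. Here is a rough Python sketch of that rule; render_with_default is a hypothetical helper, and it uses Python truthiness, which may differ from Mustache for values such as 0.

```python
import re


def render_with_default(template: str, params: dict) -> str:
    """Render {{name}} variables and {{^name}}default{{/name}} inverted sections."""

    def inverted(match):
        name, default = match.group(1), match.group(2)
        # An inverted section is kept only when the parameter is missing or falsy.
        return "" if params.get(name) else default

    text = re.sub(r"\{\{\^(\w+)\}\}(.*?)\{\{/\1\}\}", inverted, template)
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(params.get(m.group(1), "")), text)


lte_template = "{{end}}{{^end}}500{{/end}}"
with_end = render_with_default(lte_template, {"start": 1, "end": 300})
without_end = render_with_default(lte_template, {"start": 1})
print(with_end, without_end)
```

With end supplied the template yields the supplied value; without it, the text inside the inverted section is used instead.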
Conditional sections

Mustache has no if/else statement, but it has sections: a section is skipped when its variable is false or not defined.

{{#param1}}
    "This section is skipped if param1 is null or false"
{{/param1}}

Example: creating a stored Mustache script

POST _scripts/condition
{
  "script": {
    "lang": "mustache",
    "source": 
    """
        {
            "query": {
              "bool": {
                "must": {
                  "match": {
                    "name": "{{name}}"
                  }
                },
                "filter":{
                  {{#isLike}}
                    "range":{
                      "likes":{
                        {{#start}}
                          "gte":"{{start}}"
                          {{#end}},{{/end}}
                        {{/start}}
                        {{#end}}
                          "lte":"{{end}}"
                        {{/end}}
                      }
                    }
                  {{/isLike}}
                }
              }
            }
        }
    """
  }
}

Query using the stored template:

GET _search/template
{
  "id": "condition",
  "params": { "name": "gymbo", "isLike": true, "start": 1, "end": 500 }
}

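To see how the nested sections in the stored script decide which parts of the range filter survive, the following Python sketch renders a condensed copy of that template. The regex-based render function is a simplified, hypothetical stand-in for Mustache (truthiness follows Python rules), so treat it as an illustration of the idea rather than the real engine.

```python
import json
import re

SECTION = re.compile(r"\{\{#(\w+)\}\}(.*?)\{\{/\1\}\}", re.DOTALL)
VARIABLE = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, params: dict) -> str:
    """Render {{#name}}...{{/name}} sections first, then {{name}} variables."""

    def section(match):
        name, body = match.group(1), match.group(2)
        # Keep the body only when the parameter is present and truthy.
        return render(body, params) if params.get(name) else ""

    text = SECTION.sub(section, template)
    return VARIABLE.sub(lambda m: str(params.get(m.group(1), "")), text)


template = """
{
  "query": {
    "bool": {
      "must": {"match": {"name": "{{name}}"}},
      "filter": {
        {{#isLike}}
        "range": {
          "likes": {
            {{#start}}"gte": "{{start}}"{{#end}},{{/end}}{{/start}}
            {{#end}}"lte": "{{end}}"{{/end}}
          }
        }
        {{/isLike}}
      }
    }
  }
}
"""

full = json.loads(render(template, {"name": "gymbo", "isLike": True, "start": 1, "end": 500}))
only_upper = json.loads(render(template, {"name": "gymbo", "isLike": True, "end": 500}))
print(full["query"]["bool"]["filter"])
print(only_upper["query"]["bool"]["filter"])
```

The small {{#end}},{{/end}} section inside the start block emits the comma only when both bounds are present, which keeps the rendered JSON valid in either case.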
That covers the common uses of search templates. On a large project with a dedicated Elasticsearch engineer, that engineer typically wraps common queries in templates, and developers of business systems only need to call them.

Custom mapping templates

ES has its own rules for mapping the types of inserted data: for example, 10 is automatically mapped to long, and “10” is mapped to text with a built-in keyword sub-field. This is convenient, but sometimes these are not the types we want. Suppose we want the integer value 10 mapped to integer and “10” mapped to keyword. We can define a template so that, when data is inserted, the types of matching fields are determined by our predefined rules.

Note that coding standards in real projects are usually strict: all mappings are defined before any data is inserted, and even when a field is added later, the mapping is updated first and the data inserted afterwards.

Even so, custom dynamic mapping templates are worth understanding.

Default dynamic mapping effect

Try inserting a piece of data:

PUT /test_index/type/1
{
  "test_string": "hello kitty",
  "test_number": 10
}

View the mapping:

GET /test_index/_mapping/type

The response is as follows:

{
  "test_index": {
    "mappings": {
      "type": {
        "properties": {
          "test_number": {
            "type": "long"
          },
          "test_string": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

The default dynamic mapping rules may not be what we want.

For example, we may want numbers to default to integer, and strings to default to text with a sub-field named raw instead of keyword that keeps at most 128 characters.

Dynamic mapping templates

There are two ways to match:

  1. Match a predefined template based on the default data type detected for the new field.
  2. Match a predefined template based on the name of the new field, either an exact name or a wildcard pattern.

Match by data type

PUT /test_index
{
  "mappings": {
    "type": {
      "dynamic_templates": [
        { "integers": {
            "match_mapping_type": "long",
            "mapping": { "type": "integer" } } },
        { "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "fields": { "raw": { "type": "keyword", "ignore_above": 128 } } } } }
      ]
    }
  }
}

Delete the index, insert the data again, and view the mapping information as follows:

{
  "test_index": {
    "mappings": {
      "type": {
        "dynamic_templates": [
          { "integers": {
              "match_mapping_type": "long",
              "mapping": { "type": "integer" } } },
          { "strings": {
              "match_mapping_type": "string",
              "mapping": {
                "fields": { "raw": { "ignore_above": 128, "type": "keyword" } },
                "type": "text" } } }
        ],
        "properties": {
          "test_number": {
            "type": "integer"
          },
          "test_string": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword",
                "ignore_above": 128
              }
            }
          }
        }
      }
    }
  }
}

The fields are mapped to the expected types.

Match by field name

  • A field whose name starts with “long_” and that was originally of type long is converted to integer.
  • A field whose name starts with “string_” and that was originally a string is mapped to text with a raw sub-field.
  • A field whose name ends with “_text” and that was originally a string is left unchanged.

PUT /test_index
{
  "mappings": {
    "type": {
      "dynamic_templates": [
        { "long_as_integer": {
            "match_mapping_type": "long",
            "match": "long_*",
            "mapping": { "type": "integer" } } },
        { "string_as_raw": {
            "match_mapping_type": "string",
            "match": "string_*",
            "unmatch": "*_text",
            "mapping": {
              "type": "text",
              "fields": { "raw": { "type": "keyword", "ignore_above": 128 } } } } }
      ]
    }
  }
}

Insert data:

PUT /test_index/type/1
{
  "string_test": "hello kitty",
  "long_test": 10,
  "title_text": "Hello everyone"
}

Query the mapping:

{
  "test_index": {
    "mappings": {
      "type": {
        "dynamic_templates": [
          { "long_as_integer": {
              "match": "long_*",
              "match_mapping_type": "long",
              "mapping": { "type": "integer" } } },
          { "string_as_raw": {
              "match": "string_*",
              "unmatch": "*_text",
              "match_mapping_type": "string",
              "mapping": {
                "fields": { "raw": { "ignore_above": 128, "type": "keyword" } },
                "type": "text" } } }
        ],
        "properties": {
          "long_test": {
            "type": "integer"
          },
          "string_test": {
            "type": "text",
            "fields": { "raw": { "type": "keyword", "ignore_above": 128 } }
          },
          "title_text": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

The results were in line with expectations.
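The matching rules can be summarized as "first template whose conditions all hold wins, otherwise fall back to the default mapping". The Python sketch below models that idea for the three fields inserted above. It is a simplification under stated assumptions: detect_type covers only integers and strings, and fnmatchcase stands in for the wildcard matching of match and unmatch; both helpers are hypothetical.

```python
from fnmatch import fnmatchcase


def detect_type(value) -> str:
    """Simplified stand-in for dynamic type detection (integers and strings only)."""
    if isinstance(value, int) and not isinstance(value, bool):
        return "long"
    if isinstance(value, str):
        return "string"
    raise ValueError(f"type not covered by this sketch: {type(value).__name__}")


def resolve_mapping(field: str, value, templates: list) -> dict:
    """Return the mapping of the first template whose conditions all match."""
    detected = detect_type(value)
    for template in templates:
        rule = next(iter(template.values()))
        if rule.get("match_mapping_type", detected) != detected:
            continue
        if "match" in rule and not fnmatchcase(field, rule["match"]):
            continue
        if "unmatch" in rule and fnmatchcase(field, rule["unmatch"]):
            continue
        return rule["mapping"]
    # No template matched: use the default dynamic mapping shown earlier.
    if detected == "long":
        return {"type": "long"}
    return {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}}


templates = [
    {"long_as_integer": {"match_mapping_type": "long", "match": "long_*",
                         "mapping": {"type": "integer"}}},
    {"string_as_raw": {"match_mapping_type": "string", "match": "string_*", "unmatch": "*_text",
                       "mapping": {"type": "text",
                                   "fields": {"raw": {"type": "keyword", "ignore_above": 128}}}}},
]

doc = {"string_test": "hello kitty", "long_test": 10, "title_text": "Hello everyone"}
resolved = {field: resolve_mapping(field, value, templates) for field, value in doc.items()}
for field, mapping in resolved.items():
    print(field, mapping)
```

In this model title_text falls through both templates because its name does not start with string_, which is why it keeps the default keyword sub-field.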

In some log management scenarios, we can define a type and create one index per day. Such indices can be created from a mapping template that contains all the mappings we have defined.

Search highlighting

When we search for text in a browser, the keyword we entered is highlighted in the results. Looking at the HTML source, we can see that the highlighted part is wrapped in <em> tags. ES supports highlighted search as well: the returned document automatically includes these tags, compatible with HTML5 pages.

Basic highlighting syntax

Again, let’s start with a highlighted search on a music site:

GET /music/children/_search
{
  "query": {
    "match": { "content": "love" }
  },
  "highlight": {
    "fields": { "content": {} }
  }
}

The highlight parameter holds the highlighting settings; here its fields entry names content as the field to highlight. In the response, the matched word love is wrapped in <em></em> highlighting tags, which a page can style, for example in red. In short, whenever a field listed under fields contains the search term, the term is highlighted in that field's text.

{
  "took": 35, "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1, "max_score": 0.2876821,
    "hits": [
      {
        "_index": "music", "_type": "children", "_id": "5", "_score": 0.2876821,
        "_source": {
          "id": "1740e61c-63da-474f-9058-c2ab3c4f0b0a",
          "author_first_name": "Jean", "author_last_name": "Ritchie", "author": "Jean Ritchie",
          "name": "love somebody", "content": "love somebody, yes I do",
          "language": "english", "tags": "love", "length": 38, "likes": 3,
          "isRelease": true, "releaseDate": "2019-12-22"
        },
        "highlight": {
          "content": ["<em>love</em> somebody, yes I do"]
        }
      }
    ]
  }
}

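Conceptually, the highlight entry in the response is the field text with each matched term wrapped in a pair of tags. The Python sketch below reproduces that effect for a single term; it is only an illustration with a hypothetical highlight function, since the real highlighters work on analyzed tokens rather than on a regular expression.

```python
import re


def highlight(text: str, term: str, pre_tag: str = "<em>", post_tag: str = "</em>") -> str:
    """Wrap whole-word, case-insensitive occurrences of term in the given tags."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre_tag + m.group(0) + post_tag, text)


highlighted = highlight("love somebody, yes I do", "love")
print(highlighted)
```

The pre_tag and post_tag arguments play the role that pre_tags and post_tags play in a request.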
More than one field can be listed under highlight.fields, so that keywords matched in several fields are all highlighted. For example:

GET /music/children/_search
{
  "query": {
    "match": { "content": "love" }
  },
  "highlight": {
    "fields": { "name": {}, "content": {} }
  }
}

Three highlighters

There are three highlighter types:

  1. Plain highlighter: uses the standard Lucene highlighter and works well for simple queries.
  2. Unified highlighter: the default. It uses the Lucene Unified Highlighter, which breaks the text into sentences and scores them with BM25; it supports exact and fuzzy queries.
  3. Fast vector highlighter: uses the Lucene Fast Vector Highlighter. It is used when term_vector is set to with_positions_offsets in the field mapping, and it has a performance advantage on very large text (greater than 1 MB).

For example, the following mapping enables term vectors on the content field:

PUT /music
{
  "mappings": {
    "children": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "term_vector": "with_positions_offsets"
        }
      }
    }
  }
}

In general, the plain highlighter is sufficient and needs no additional settings. If highlighting performance matters, try the unified highlighter. If the field values are extremely large (over 1 MB), use the fast vector highlighter.

Customizing the highlight HTML tags

The default highlighting tag is <em>. It can be replaced with any tag you like, which you can then style as you wish:

GET /music/children/_search
{
  "query": {
    "match": { "content": "Love" }
  },
  "highlight": {
    "pre_tags": ["<tag1>"],
    "post_tags": ["</tag1>"],
    "fields": {
      "content": { "type": "plain" }
    }
  }
}

Highlight fragment settings

For long text it is impractical to display the whole field on the page; we only need to show the context around the keyword.

GET /_search
{
  "query": {
    "match": { "content": "friend" }
  },
  "highlight": {
    "fields": {
      "content": { "fragment_size": 150, "number_of_fragments": 3, "no_match_size": 150 }
    }
  }
}

fragment_size: the length, in characters, of each highlighted fragment. The default is 100.

number_of_fragments: the highlighted text may contain several fragments; this sets the maximum number of fragments to return.

no_match_size: the amount of text to return from the beginning of the field when no fragment matches.
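How these settings interact can be sketched with a deliberately naive fragmenter in Python. It cuts fixed-size character windows around each hit, which is not how the real highlighters choose fragment boundaries, and the fragments and tag_window helpers are hypothetical; the point is only to show the roles of the size, count, and no-match parameters.

```python
import re


def tag_window(text, spans, start, end):
    """Return text[start:end] with every match span fully inside it wrapped in <em> tags."""
    out, pos = [], start
    for s, e in spans:
        if s >= start and e <= end:
            out += [text[pos:s], "<em>", text[s:e], "</em>"]
            pos = e
    out.append(text[pos:end])
    return "".join(out)


def fragments(text, term, fragment_size=100, number_of_fragments=5, no_match_size=0):
    """Naive fragmenter: one character window per hit not already covered."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    spans = [m.span() for m in pattern.finditer(text)]
    result, covered_until = [], 0
    for s, _ in spans:
        if len(result) == number_of_fragments:
            break
        if s < covered_until:
            continue  # this hit is already shown in the previous fragment
        start = max(covered_until, s - fragment_size // 2)
        end = min(len(text), start + fragment_size)
        result.append(tag_window(text, spans, start, end))
        covered_until = end
    if not result and no_match_size > 0:
        return [text[:no_match_size]]  # no hit: show the beginning of the field
    return result


sample = "An old friend called today. " + "filler " * 40 + "Every friend was invited."
found = fragments(sample, "friend", fragment_size=40, number_of_fragments=3)
for fragment in found:
    print(fragment)
```

Lowering number_of_fragments to 1 keeps only the first window, and a non-zero no_match_size returns the start of the text when the term does not occur at all.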

Geolocation

There are many location-based apps today, and Elasticsearch supports them too: ES can combine geolocation with full-text search, structured search, and analytics.

Geo point data type

For location-based search, Elasticsearch provides a dedicated data type called geo_point, which stores a position as a latitude and longitude pair and supports basic queries such as geo_bounding_box.

Create a mapping with a geo_point field:

PUT /location
{
  "mappings": {
    "hotels": {
      "properties": {
        "location": {
          "type": "geo_point"
        },
        "content": {
          "type": "text"
        }
      }
    }
  }
}

Insert data

The recommended way to insert data is the object format, where latitude and longitude are named explicitly:

# lat: latitude, lon: longitude
PUT /location/hotels/1
{
  "content": "7days hotel",
  "location": { "lon": 113.928619, "lat": 22.528091 }
}

There are two other ways to insert data, but they make it particularly easy to confuse the order of latitude and longitude, so they are not recommended:

# Array format: [lon, lat]
PUT /location/hotels/2
{
  "content": "7days hotel",
  "location": [113.923567, 22.523988]
}

# String format: "lat,lon"
PUT /location/hotels/3
{
  "text": "7days Orient Sunseed Hotel",
  "location": "22.521184, 113.914578"
}
Query methods

The geo_bounding_box query finds points that fall inside a rectangle defined by its top-left and bottom-right corners:

GET /location/hotels/_search
{
  "query": {
    "geo_bounding_box": {
      "location": {
        "top_left": { "lon": 112, "lat": 23 },
        "bottom_right": { "lon": 114, "lat": 21 }
      }
    }
  }
}
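The bounding-box test itself is just two range checks, one per axis. A minimal Python sketch, using the corner values from the query above and the first hotel's coordinates, might look like this (in_bounding_box is a hypothetical helper and ignores boxes that cross the antimeridian):

```python
def in_bounding_box(lon, lat, top_left, bottom_right):
    """True if the point lies inside the rectangle given by its two corners."""
    return (top_left["lon"] <= lon <= bottom_right["lon"]
            and bottom_right["lat"] <= lat <= top_left["lat"])


top_left = {"lon": 112, "lat": 23}
bottom_right = {"lon": 114, "lat": 21}

hotel_inside = in_bounding_box(113.928619, 22.528091, top_left, bottom_right)
print(hotel_inside)
```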

Common query scenarios

geo_bounding_box

GET /location/hotels/_search
{
  "query": {
    "bool": {
      "must": [{ "match_all": {} }],
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": { "lon": 112, "lat": 23 },
            "bottom_right": { "lon": 114, "lat": 21 }
          }
        }
      }
    }
  }
}

geo_polygon: a polygonal region, here a triangle defined by three points

Polygons are supported, but this filter is very expensive, so use it sparingly.

GET /location/hotels/_search
{
  "query": {
    "bool": {
      "must": [{ "match_all": {} }],
      "filter": {
        "geo_polygon": {
          "location": {
            "points": [
              { "lon": 115, "lat": 23 },
              { "lon": 113, "lat": 25 },
              { "lon": 112, "lat": 21 }
            ]
          }
        }
      }
    }
  }
}
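One reason polygon filters cost more than a bounding box is that each candidate point has to be tested against every edge. The Python sketch below uses the classic ray-casting test on flat lon/lat coordinates, which is a simplification (it treats the coordinates as a plane) and is not claimed to be the algorithm Elasticsearch uses; point_in_polygon is a hypothetical helper.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lon1, lat1 = polygon[i]
        lon2, lat2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            crossing_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < crossing_lon:
                inside = not inside
    return inside


triangle = [(115, 23), (113, 25), (112, 21)]  # (lon, lat) pairs from the query above

hotel_inside = point_in_polygon(113.928619, 22.528091, triangle)
print(hotel_inside)
```

An odd number of crossings means the point is inside the polygon, an even number means it is outside.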
geo_distance

Searching by distance from the current location is very useful:

GET /location/hotels/_search
{
  "query": {
    "bool": {
      "must": [{ "match_all": {} }],
      "filter": {
        "geo_distance": {
          "distance": 500,
          "location": { "lon": 113.911231, "lat": 22.523375 }
        }
      }
    }
  }
}

Sort by distance

A search based on the current location usually specifies an upper limit on distance, such as 2 km or 5 km, and the results show the distance from the current location (in a unit you specify), sorted from nearest to farthest. This is a very common scenario.

Example request:

GET /location/hotels/_search
{
  "query": {
    "bool": {
      "must": [{ "match_all": {} }],
      "filter": {
        "geo_distance": {
          "distance": 2000,
          "location": { "lon": 113.911231, "lat": 22.523375 }
        }
      }
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": { "lon": 113.911231, "lat": 22.523375 },
        "order": "asc",
        "unit": "m",
        "distance_type": "plane"
      }
    }
  ]
}

  • filter.geo_distance.distance: the maximum distance, here 2000 m
  • _geo_distance: a fixed keyword; the latitude and longitude of the reference position are given beneath it
  • order: sort order, asc or desc
  • unit: the unit of the returned distance, such as m or km
  • distance_type: how the distance is calculated: sloppy_arc (default), arc (accurate), or plane (fastest)

The response is as follows:

"hits": [
  {
    "_index": "location", "_type": "hotels", "_id": "3", "_score": null,
    "_source": { "text": "7days hotel Orient Sunseed Hotel", "location": "22.521184, 113.914578" },
    "sort": [421.35435857277366]
  },
  {
    "_index": "location", "_type": "hotels", "_id": "2", "_score": null,
    "_source": { "content": "7days hotel", "location": [113.923567, 22.523988] },
    "sort": [1268.8952707727062]
  }
]

The sort value is the distance from the current location, in meters.
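The sort values can be sanity-checked outside Elasticsearch. The Python sketch below computes great-circle distances with the haversine formula for the two documents shown in the response, then filters and sorts them the way the request does. The Earth radius constant is an assumption (the mean radius in meters), and haversine is swapped in here for the plane calculation the request asks for, so the results should only be expected to land close to the values above, not to match them digit for digit.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


origin = (22.523375, 113.911231)  # (lat, lon) of the current location
hotels = {
    "2": (22.523988, 113.923567),
    "3": (22.521184, 113.914578),
}

distances = {hid: haversine_m(*origin, lat, lon) for hid, (lat, lon) in hotels.items()}
# Keep hotels within 2000 m and sort them from nearest to farthest.
within_2km = sorted((d, hid) for hid, d in distances.items() if d <= 2000)
for d, hid in within_2km:
    print(hid, round(d, 1))
```

Over a few kilometers the difference between a planar approximation and the great-circle distance is tiny, which is why plane is a reasonable choice for "nearby" searches like this one.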

Count the hotels within given distance ranges of my current location

unit is the distance unit; mi and km are commonly used.

distance_type is the method used to calculate the distance: sloppy_arc (default), arc (accurate), or plane (fastest).

GET /location/hotels/_search
{
  "size": 0,
  "aggs": {
    "group_by_distance": {
      "geo_distance": {
        "field": "location",
        "origin": { "lon": 113.911231, "lat": 22.523375 },
        "unit": "mi",
        "distance_type": "arc",
        "ranges": [
          { "from": 0, "to": 500 },
          { "from": 500, "to": 1500 },
          { "from": 1500, "to": 2000 }
        ]
      }
    }
  }
}
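The aggregation boils down to counting how many distances fall into each range, with from inclusive and to exclusive. The Python sketch below shows that bucketing step on its own. It assumes contiguous ranges of 0 to 500, 500 to 1500, and 1500 to 2000, takes precomputed distances rounded from the earlier sort values (in meters, whereas the request above asks for miles), and uses hypothetical bucket keys.

```python
def bucket_counts(distances, ranges):
    """Count distances per range; 'from' is inclusive and 'to' is exclusive."""
    counts = {}
    for r in ranges:
        key = f"{r['from']}-{r['to']}"
        counts[key] = sum(1 for d in distances if r["from"] <= d < r["to"])
    return counts


ranges = [{"from": 0, "to": 500}, {"from": 500, "to": 1500}, {"from": 1500, "to": 2000}]
distances_m = [421.35, 1268.90]  # rounded from the sort values shown earlier

counts = bucket_counts(distances_m, ranges)
print(counts)
```

Keeping the ranges contiguous and non-overlapping ensures each document is counted in at most one bucket.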

Summary

This article briefly introduced the basic usage of search templates, mapping templates, search highlighting, and geolocation. Projects that use ES more deeply will find search templates and mapping templates very useful. Highlighted search is typically seen in browser search engines, while geolocation queries are interesting and common in location-based apps.
