
Once you have an index library, you have the equivalent of a database. Next you need the types in the index library, which correspond to the tables in a database. Creating a database table requires setting field constraints, and the same is true for an index library: when creating a type, you need to know which fields exist under that type and what constraints each field carries. This is called field mapping.

Note: Elasticsearch 7.x removed the ability to set a custom type on an index (the type defaults to _doc), but fields are still defined through mappings.

We have already seen field constraints in Lucene. They include, but are not limited to:

  • The data type of the field
  • Whether the field is stored
  • Whether the field is indexed
  • Whether the field is analyzed (tokenized)
  • Which analyzer is used

Creating a field mapping

Syntax

The request method is still PUT:

PUT /index_name/_mapping/type_name
{
    "properties": {
        "field_name": {
            "type": "type",
            "index": true,
            "store": true,
            "analyzer": "analyzer_name"
        }
    }
}
  • Type name: the type concept, similar to a table in a database
  • Field name: any name you choose. Each field takes a number of attributes, such as:
    • type: the data type, which can be text, keyword, long, short, date, integer, object, and so on
    • index: whether the field is indexed. The default value is true
    • store: whether the field is stored. The default value is false
    • analyzer: the analyzer to use. Here ik_max_word means the IK analyzer

Example

Initiate a request:

PUT lagou/_mapping/goods
{
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "ik_max_word"
        },
        "images": {
            "type": "keyword",
            "index": false
        },
        "price": {
            "type": "float"
        }
    }
}

Response results:

{
    "acknowledged": true
}

In this example, we added a type named goods to the lagou index library and set three fields in the type:

  • title: the product title
  • images: the product image URLs
  • price: the product price

Viewing a mapping

Syntax:

GET /index_name/_mapping

This returns the mappings of all types in the index library. To view the mapping of a single type, append the type name to the path:

GET /index_name/_mapping/type_name

Example:

GET /lagou/_mapping/goods

Response:

{
    "lagou": {
        "mappings": {
            "goods": {
                "properties": {
                    "images": {
                        "type": "keyword",
                        "index": false
                    },
                    "price": {
                        "type": "float"
                    },
                    "title": {
                        "type": "text",
                        "analyzer": "ik_max_word"
                    }
                }
            }
        }
    }
}

Mapping properties in detail

type

| Category | Subcategory | Types | Use |
|---|---|---|---|
| Core types | String types | text, keyword | Structured search, full-text search, aggregation, sorting, and so on |
| | Integer types | integer, long, short, byte | The shorter the field, the more efficient indexing and searching are |
| | Floating-point types | double, float, half_float, scaled_float | |
| | Boolean type | boolean | |
| | Date type | date | |
| | Range types | range | |
| | Binary type | binary | Accepts a binary value as a Base64-encoded string. Not stored by default and not searchable |
| Compound types | Array type | array | |
| | Object type | object | Used for a single JSON object |
| | Nested type | nested | Used for arrays of JSON objects |
| Geographic types | Geo-point type | geo_point | Latitude/longitude points |
| | Geo-shape type | geo_shape | Used for complex shapes such as polygons |
| Special types | IP type | ip | Used for IPv4 and IPv6 addresses |
| | Completion type | completion | Provides auto-completion suggestions |
| | Token count type | token_count | Counts the number of tokens in a string |

Let’s look at a few key ones:

  • String: there are two string types
    • text: used for full-text content such as article titles and body text. text fields are analyzed (segmented into terms); they are not used for sorting and are rarely used for aggregation.
    • keyword: used to index structured content such as email addresses and ID numbers. keyword fields are not analyzed and match only on the exact full value. They support aggregation.

Both types are common. Sometimes you want a string field to support both, which you can do with its multi-fields feature:

"properties": {
    "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
            "sort": {
                "type": "keyword"
            }
        },
        "index": true
    }
}
  • Numeric: there are two kinds of numeric types
    • Basic data types: long, integer, short, byte, double, float, half_float
    • double: double precision, 64 bits
    • float: single precision, 32 bits
    • half_float: half precision, 16 bits
    • High-precision floating-point type: scaled_float
      • A floating-point number backed by a long and scaled by a fixed double scaling factor.
      • You need to specify a scaling factor, such as 10 or 100. Elasticsearch multiplies the actual value by this factor and stores the result, then restores it when the value is retrieved.
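The multiply-then-restore behaviour described for scaled_float can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Elasticsearch's implementation; the function names are hypothetical, and Python's round() may treat exact .5 ties differently from Elasticsearch.

```python
def encode_scaled_float(value: float, scaling_factor: float) -> int:
    """Multiply by the scaling factor and round to the nearest integer (the stored long)."""
    return round(value * scaling_factor)


def decode_scaled_float(stored: int, scaling_factor: float) -> float:
    """Divide the stored integer by the scaling factor to restore the value."""
    return stored / scaling_factor


# With a factor of 100, two decimal places survive the round trip.
stored_price = encode_scaled_float(19.99, 100)
restored_price = decode_scaled_float(stored_price, 100)

# Digits beyond the factor's precision are lost: 19.987 is stored as 1999 too.
stored_lossy = encode_scaled_float(19.987, 100)
```

The choice of factor is a trade-off: a larger factor keeps more decimal places but stores larger integers.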
  • Date: the date type
    • Elasticsearch can store dates as formatted strings, but it is recommended to store them as millisecond values in a long to save space.
  • Array: the array type
    • When matching, the array is considered a match if any one of its elements matches
    • When sorting, the smallest value in the array is used for ascending order and the largest value for descending order
String array: ["one", "two"]
Integer array: [1, 2]
Array of arrays: [1, [2, 3]], which is equivalent to [1, 2, 3]
Array of objects: [{ "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]
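The two array rules above (match on any element; sort by the smallest value ascending and the largest descending) can be mimicked in plain Python. This is a sketch of the described behaviour only; the document ids and values are made up for illustration.

```python
def matches(values, term):
    """An array field matches if any one of its elements equals the term."""
    return any(v == term for v in values)


def sort_value(values, order):
    """Ascending sorts use the smallest element, descending sorts the largest."""
    return min(values) if order == "asc" else max(values)


# Hypothetical documents: id -> values of an array field
docs = {"a": [5, 1], "b": [3, 4], "c": [2, 9]}

ascending = sorted(docs, key=lambda d: sort_value(docs[d], "asc"))
descending = sorted(docs, key=lambda d: sort_value(docs[d], "desc"), reverse=True)
hits = [d for d in docs if matches(docs[d], 4)]
```

Here document "a" comes first in ascending order because its smallest element is 1, while "c" comes first in descending order because its largest element is 9.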
  • Object: the object type
    • JSON documents are hierarchical in nature: a document can contain inner objects, and those inner objects can themselves contain inner objects.
{
    "region": "US",
    "manager.age": 30,
    "manager.name": "John Smith"
}

The mapping for indexing it is as follows:

{
    "mappings": {
        "properties": {
            "region": { "type": "keyword" },
            "manager": {
                "properties": {
                    "age": { "type": "integer" },
                    "name": { "type": "text" }
                }
            }
        }
    }
}

If an object is stored in the index library, such as manager above, it is flattened into two fields: manager.name and manager.age.
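The flattening of inner objects into dotted field names can be illustrated with a small recursive helper. This is a sketch of the idea rather than how Elasticsearch does it internally; flatten is a hypothetical function name.

```python
def flatten(obj, prefix=""):
    """Turn nested dicts into a flat dict whose keys are dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into the inner object, extending the dotted prefix.
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


document = {"region": "US", "manager": {"age": 30, "name": "John Smith"}}
flat_document = flatten(document)
```

The nested manager object becomes the two keys manager.age and manager.name, next to the unchanged region key.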

  • IP: the IP address type
PUT my_index
{
    "mappings": {
        "_doc": {
            "properties": {
                "ip_addr": {
                    "type": "ip"
                }
            }
        }
    }
}

PUT my_index/_doc/1
{
    "ip_addr": "192.168.1.1"
}

GET my_index/_search
{
    "query": {
        "term": {
            "ip_addr": "192.168.0.0/16"
        }
    }
}

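The term query above finds the document because 192.168.1.1 falls inside the 192.168.0.0/16 block. The containment check itself can be reproduced with Python's standard ipaddress module; this only illustrates the CIDR arithmetic and does not talk to Elasticsearch. in_cidr is a hypothetical helper name.

```python
import ipaddress


def in_cidr(address: str, cidr: str) -> bool:
    """Return True if the address belongs to the given CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


inside = in_cidr("192.168.1.1", "192.168.0.0/16")
outside = in_cidr("10.0.0.1", "192.168.0.0/16")
```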
index

index controls whether a field is indexed.

  • true: the field is indexed and can be used for searching and filtering. This is the default. A field can be used as a search condition only if its index is set to true.
  • false: the field is not indexed and cannot be searched

Because index defaults to true, every field is indexed unless you configure otherwise.

For fields that we do not want indexed, such as a product's image URL, we need to set index to false manually.

store

Whether to store additional data.

When we learned about Lucene, we learned that if the store of a field is set to false, the value of that field will not show up in the document list and will not show up in the user’s search results.

However, in Elasticsearch, results can be found even if store is set to false.

The reason for this is that when Elasticsearch creates a document index, it backs up the original data in the document into a property called _source. And we can filter _source to choose what to display and what not to display.

If we set store to true, we store an extra copy of data in addition to _source, which is unnecessary, so we usually set store to false. In fact, the default value of store is false.

In some cases, though, it makes sense to store a particular field. For example, if your document contains a title, a date, and a very large content field, you may want to retrieve just the title and the date without having to extract them from a large _source:

PUT my_index
{
    "mappings": {
        "_doc": {
            "properties": {
                "title": {
                    "type": "text",
                    "store": true
                },
                "date": {
                    "type": "date",
                    "store": true
                },
                "content": {
                    "type": "text"
                }
            }
        }
    }
}
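To read only the stored title and date back, a search request can list them under stored_fields; to trim what comes back from _source instead, it can list fields under _source. The Python sketch below only builds such request bodies as dictionaries. build_search_body is a hypothetical helper, and sending the body to a cluster is left out.

```python
import json


def build_search_body(query, stored_fields=None, source=None):
    """Assemble a search request body, optionally limiting the returned fields."""
    body = {"query": query}
    if stored_fields is not None:
        # Fields mapped with "store": true can be fetched individually.
        body["stored_fields"] = list(stored_fields)
    if source is not None:
        # Alternatively, filter which parts of _source are returned.
        body["_source"] = source
    return body


stored_body = build_search_body({"match_all": {}}, stored_fields=["title", "date"])
source_body = build_search_body({"match_all": {}}, source=["title", "date"])

print(json.dumps(stored_body, indent=4))
```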

boost

Website weight: the authority value a search engine assigns to a website (including its pages). The higher a site's weight, the more prominence the search engine gives it and the better it ranks. Raising a site's weight not only helps its pages rank higher but also increases traffic to the whole site and the trust placed in it, so it matters a great deal. In SEO, weight (in English, Page Strength) is a measure of a site's importance and authority. Note that: 1. weight is not the same as ranking; 2. weight has a very large impact on ranking; 3. the weight of the whole site helps the ranking of its inner pages.

Weight: when adding data, you can specify the weight of that data. The higher the weight, the higher the score and the higher the ranking.

PUT my_index
{
    "mappings": {
        "_doc": {
            "properties": {
                "title": {
                    "type": "text",
                    "boost": 2
                },
                "content": {
                    "type": "text"
                }
            }
        }
    }
}

Matches on the title field carry twice the weight of matches on the content field. The default boost value is 1.0.

The boost applies only to term queries (prefix, range, and fuzzy queries are not boosted).
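A boost can also be supplied on the query itself instead of in the mapping. The Python sketch below builds a request body in which a match on title counts twice as much as a match on content. boosted_match and the search text "phone" are made up for illustration, and nothing is sent to a cluster.

```python
def boosted_match(field, text, boost=1.0):
    """Build a match clause for one field with an explicit boost."""
    return {"match": {field: {"query": text, "boost": boost}}}


search_body = {
    "query": {
        "bool": {
            "should": [
                boosted_match("title", "phone", boost=2.0),  # title counts double
                boosted_match("content", "phone"),           # default boost of 1.0
            ]
        }
    }
}
```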

Creating the index library and type at once

Step 2 (adding the type mapping):

PUT lagou/_mapping/goods
{
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "ik_max_word"
        },
        "images": {
            "type": "keyword",
            "index": false
        },
        "price": {
            "type": "float"
        }
    }
}

The index library and its type can also be created in a single request:

PUT /index_name
{
    "settings": {
        "setting_name": "setting_value"
    },
    "mappings": {
        "type_name": {
            "properties": {
                "field_name": {
                    "mapping_attribute_name": "mapping_attribute_value"
                }
            }
        }
    }
}

Try it:

PUT /lagou2
{
    "settings": {},
    "mappings": {
        "goods": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                }
            }
        }
    }
}

Results:

{
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "lagou2"
}