Elasticsearch: Documents, Indexes, and the REST API

These are study notes from the Geek Time course "Elasticsearch Core Technology and Practice"; interested readers can subscribe to the course.

Basic concepts

Documents

Elasticsearch is document-oriented; a document is the smallest unit of searchable data.

Examples:
- A log entry in a log file
- The details of a movie
- The details of a song / the contents of a PDF document
Documents are serialized as JSON and stored in Elasticsearch
- A JSON object consists of fields
- Each field has a corresponding type (string/number/boolean/date/binary/range)
Each document has a unique ID
- You can specify the ID yourself
- Or let Elasticsearch generate it automatically
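
As a rough illustration of the two ID options above, the sketch below builds the HTTP method and path used to index one document. The helper name `index_request` and the sample song are made up for this example, and the `PUT <index>/_doc/<id>` and `POST <index>/_doc` forms assume a 7.x-style cluster.

```python
import json


def index_request(index, doc, doc_id=None):
    """Return (method, path, body) for indexing one document.

    Hypothetical helper for illustration: it only builds the request,
    it does not talk to a cluster.
    """
    body = json.dumps(doc)  # the document is serialized as JSON
    if doc_id is None:
        # No ID given: POST lets Elasticsearch generate one.
        return "POST", f"/{index}/_doc", body
    # Caller-supplied ID: PUT to the document's own path.
    return "PUT", f"/{index}/_doc/{doc_id}", body


song = {"title": "Example Song", "duration_seconds": 215}  # made-up sample data
print(index_request("songs", song, doc_id="42")[:2])
print(index_request("songs", song)[:2])
```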

JSON documents

A document contains a series of fields, similar to a record (row) in a database
JSON documents have a flexible format; no schema has to be defined in advance
- A field's type can be specified explicitly or inferred automatically by Elasticsearch
- Arrays and nested objects are supported
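
To make "inferred automatically" more concrete, here is a deliberately simplified sketch of how a field type could be guessed from a JSON value. The function `guess_field_type` is hypothetical and is not Elasticsearch's actual algorithm; real dynamic mapping also handles date detection, adds keyword sub-fields for strings, and more.

```python
def guess_field_type(value):
    """Very rough approximation of inferring a field type from a JSON value."""
    if isinstance(value, bool):  # check bool first: in Python, bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # An array takes the type of its first non-null element.
        for item in value:
            if item is not None:
                return guess_field_type(item)
        return None  # empty array: nothing to infer from yet
    return None  # null: no type can be inferred


doc = {"year": 1995, "title": "Toy Story", "genres": ["Adventure", "Comedy"]}
for field, value in doc.items():
    print(field, "->", guess_field_type(value))
```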

movieId,title,genres
1,Toy Story(1995),Adventure|Animation|Children|Comedy|Fantasy

The CSV record above, converted to a JSON document for ES:

{
    "year" : 1995,
    "@version" : 1,
    "genres" : [
        "Adventure",
        "Animation",
        "Children",
        "Comedy",
        "Fantasy"
    ],
    "id" : "1",
    "title" : "Toy Story"
}
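
The conversion from the CSV row to a JSON document can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool the course uses; `row_to_document` is a made-up name, and the `@version` field is omitted because it cannot be derived from the CSV itself.

```python
import csv
import io
import json
import re

CSV_TEXT = """movieId,title,genres
1,Toy Story(1995),Adventure|Animation|Children|Comedy|Fantasy
"""


def row_to_document(row):
    """Turn one CSV row (a dict) into a JSON-ready document."""
    # Split "Toy Story(1995)" into a title and an optional trailing year.
    match = re.match(r"^(.*?)\s*\((\d{4})\)\s*$", row["title"])
    if match:
        title, year = match.group(1), int(match.group(2))
    else:
        title, year = row["title"], None
    doc = {
        "id": row["movieId"],
        "title": title,
        "genres": row["genres"].split("|"),  # pipe-separated list becomes a JSON array
    }
    if year is not None:
        doc["year"] = year
    return doc


documents = [row_to_document(r) for r in csv.DictReader(io.StringIO(CSV_TEXT))]
print(json.dumps(documents[0], indent=4))
```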

Document metadata

{
    "_index" : "movies",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 14.626,
    "_source" : {
        "year" : 1995,
        "@version" : 1,
        "genres" : [
            "Adventure",
            "Animation",
            "Children",
            "Comedy",
            "Fantasy"
        ],
        "id" : "1",
        "title" : "Toy Story"
    }
}

Metadata fields describe the document itself
- _index: the name of the index the document belongs to
- _type: the name of the type the document belongs to
- _id: the unique ID of the document
- _source: the original JSON data of the document
- _all: combines the content of all fields into one field (deprecated)
- _version: the version of the document
- _score: the relevance score
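
A small sketch of how client code might separate these metadata fields from the original document. The hit below is abbreviated from the example above, and `split_hit` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json

HIT = json.loads("""
{
    "_index": "movies",
    "_type": "_doc",
    "_id": "1",
    "_score": 14.626,
    "_source": {"year": 1995, "id": "1", "title": "Toy Story"}
}
""")


def split_hit(hit):
    """Separate a hit's metadata fields from the original document."""
    source = hit.get("_source", {})
    # Everything starting with "_" except _source is treated as metadata here.
    metadata = {
        key: value
        for key, value in hit.items()
        if key.startswith("_") and key != "_source"
    }
    return metadata, source


metadata, source = split_hit(HIT)
print(metadata["_index"], metadata["_id"], source["title"])
```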

Indexes

{
    "movies" : {
        "settings" : {
            "index" : {
                "create_date" : "15526261177",
                "number_of_shards" : "2",
                "number_of_replicas" : "0",
                "uuid" : "",
                "version" : {
                    "created" : "302302"
                },
                "provided_name" : "movies"
            }
        }
    }
}

Index: an index is a container for documents, a collection of documents of the same kind
- An index embodies the concept of logical space: each index has its own mapping, which defines the field names and field types of the documents it contains
- A shard embodies the concept of physical space: the data in an index is spread across shards
Mapping and settings of an index
- The mapping defines the types of the document fields
- The settings define how the data is distributed
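
To show how mapping and settings fit together, the sketch below assembles a request body for creating an index. It assumes the typeless 7.x request format; the helper `create_index_request` and the chosen field types are illustrative choices, not taken from the course.

```python
import json


def create_index_request(name, shards, replicas, properties):
    """Build (method, path, body) for creating an index.

    Hypothetical helper: settings control data distribution,
    mappings declare the field types.
    """
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        },
        "mappings": {"properties": properties},
    }
    return "PUT", f"/{name}", json.dumps(body, indent=2)


method, path, body = create_index_request(
    "movies",
    shards=2,
    replicas=0,
    properties={
        "title": {"type": "text"},
        "year": {"type": "integer"},
        "genres": {"type": "keyword"},
    },
)
print(method, path)
print(body)
```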

Different meanings of "index"

Noun: you can create many different indexes in an Elasticsearch cluster
Verb: saving a document to Elasticsearch is also called indexing
- In ES, this is the process of building an inverted index
More generally, "index" can also refer to a data structure such as a B-tree index or an inverted index
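
For readers unfamiliar with the term, an inverted index maps each term to the documents that contain it. The toy version below uses made-up titles and splits on whitespace only; real analysis in Elasticsearch is far more involved.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


docs = {1: "Toy Story", 2: "Toy Soldiers", 3: "A Love Story"}  # made-up sample data
inverted = build_inverted_index(docs)
print(inverted["toy"])    # [1, 2]
print(inverted["story"])  # [1, 3]
```

Looking up a term is then a single dictionary access, which is why this structure suits full-text search better than scanning every document.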

Types

Before 7.0, an index could have multiple types (indices newly created in 6.x were already limited to a single type)
From 7.0 on, an index can have only one type, "_doc"

Abstraction and Analogy

RDBMS	Elasticsearch
Table	Index(Type)
Row	Document
Column	Field
Schema	Mapping
SQL	DSL

1. Before 7.0, an index could have multiple types.

2. Types are deprecated. As of 7.0, an index can have only one type, "_doc".

3. Differences between a traditional relational database and Elasticsearch:

Elasticsearch: schemaless / relevance scoring / high-performance full-text search
RDBMS: transactions / joins

REST API: easy to call from many programming languages

Some basic APIs

Indices
- Create Index
  - PUT movies
- View all indexes
  - GET _cat/indices
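
Because these are plain HTTP calls, they can be issued from any language. Below is a minimal Python sketch using only the standard library. The base URL assumes a local, unsecured node on port 9200, and the helper names are made up; the script only prints the prepared requests, so `send` has to be called explicitly to reach a cluster.

```python
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def create_index(name):
    """Build the request for `PUT <name>` (create an index)."""
    return urllib.request.Request(f"{BASE_URL}/{name}", method="PUT")


def list_indices():
    """Build the request for `GET _cat/indices` (view all indexes)."""
    # The `v` query parameter asks for column headers in the output.
    return urllib.request.Request(f"{BASE_URL}/_cat/indices?v", method="GET")


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


for req in (create_index("movies"), list_indices()):
    print(req.get_method(), req.full_url)
```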

Original address

cbaj.gitee.io/blog/2020/0…