A null value cannot be indexed or searched. When a field is set to null (or to an empty array, or to an array containing only null values), it is treated as though the field has no value. The null_value parameter replaces explicit null values with a specified value so that the field can be indexed and searched.

 

Example 1

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": {
        "type":       "keyword",
        "null_value": "NULL" 
      }
    }
  }
}

In the mapping above, we define a null_value for the status_code field. When a document is indexed with status_code explicitly set to null, the string “NULL” is indexed for that field instead. Let’s illustrate this with the following two documents:

PUT my-index-000001/_doc/1
{
  "status_code": null
}

PUT my-index-000001/_doc/2
{
  "status_code": [] 
}

The two commands above write two documents to the index my-index-000001. We then run the following search:

GET my-index-000001/_search

The search returns both documents:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : { "status_code" : null }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : { "status_code" : [ ] }
      }
    ]
  }
}

This is expected, since we indexed two documents. We then run a term query for the replacement value:

GET my-index-000001/_search
{
  "query": {
    "term": {
      "status_code": "NULL" 
    }
  }
}

The query above returns:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 1, "relation" : "eq" },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : { "status_code" : null }
      }
    ]
  }
}

The first document is returned, but the second is not. The first document explicitly sets “status_code”: null, so the null_value “NULL” is indexed in its place. The second document contains an empty array, which holds no explicit null, so nothing is replaced and the term query does not match it. Note that the _source of the first document still shows null: null_value changes only what is indexed, not the stored source.
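The replacement rule can be summarized with a small Python model. This is only a simplified sketch of the behavior described above, not Elasticsearch code; the function name indexed_terms is hypothetical and exists purely for illustration.

```python
def indexed_terms(source, field, null_value=None):
    """Return the values that would be indexed for `field`.

    Simplified model (an assumption, not Elasticsearch's implementation):
    - an explicit null is replaced by null_value when one is configured,
    - without a null_value, nulls are skipped,
    - an empty array or an absent field yields no values at all.
    """
    if field not in source:
        return []  # absent field: there is no explicit null to replace

    raw = source[field]
    values = raw if isinstance(raw, list) else [raw]

    terms = []
    for value in values:
        if value is None:
            if null_value is not None:
                terms.append(null_value)  # explicit null -> replacement value
        else:
            terms.append(value)
    return terms


doc1 = {"status_code": None}  # explicit null, as in document 1
doc2 = {"status_code": []}    # empty array, as in document 2

print(indexed_terms(doc1, "status_code", "NULL"))  # ['NULL']
print(indexed_terms(doc2, "status_code", "NULL"))  # []
```

In this model, a term query for “NULL” can only match document 1, because document 2 has no indexed term for status_code.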

 

Example 2

Suppose we have the following two documents:

PUT twitter/_doc/1
{
  "age": null
}

PUT twitter/_doc/2
{
  "age": 20
}

We now have two documents. The first has an age of null, which means the field has no indexed value and cannot be searched or aggregated. Suppose we run the following aggregation:

GET twitter/_search
{
  "size": 0,
  "aggs": {
    "avg_age": {
      "avg": {
        "field": "age"
      }
    }
  }
}

The aggregation returns:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "avg_age" : { "value" : 20.0 }
  }
}

The average age is 20 because the first document is treated as having no age value. How can we make the first document participate in the aggregation? We can use null_value to supply a value for a null field. Let’s change the mapping to:

DELETE  twitter

PUT twitter
{
  "mappings": {
    "properties": {
      "age": {
        "type": "float",
        "null_value": 0
      }
    }
  }
}

Let’s index the previous two documents again:

PUT twitter/_doc/1
{
  "age": null
}

PUT twitter/_doc/2
{
  "age": 20
}

Because null_value is now defined, an age of null is indexed as 0, so the first document contributes a value. Run the following aggregation:

GET twitter/_search
{
  "size": 0,
  "aggs": {
    "avg_age": {
      "avg": {
        "field": "age"
      }
    }
  }
}

The result is:

{
  "took" : 703,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "avg_age" : { "value" : 10.0 }
  }
}

Now the average is 10, because (20 + 0) / 2 = 10.

Note that age must be explicitly set to null; otherwise null_value has no effect. For example:

DELETE twitter

PUT twitter
{
  "mappings": {
    "properties": {
      "age": {
        "type": "float",
        "null_value": 0
      }
    }
  }
}

PUT twitter/_doc/1
{
  "content": "This is cool"
}

PUT twitter/_doc/2
{
  "age": 20,
  "content": "This is cool too!"
}

Here the first document does not contain an age field at all, so null_value is not applied to it. If we run the following aggregation:

GET twitter/_search
{
  "size": 0,
  "aggs": {
    "avg_age": {
      "avg": {
        "field": "age"
      }
    }
  }
}

The result is:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "avg_age" : { "value" : 20.0 }
  }
}

The first document is not included in the aggregation, because its age field is absent rather than explicitly null.
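The three averages in this example can be reproduced with a short Python model. It is a sketch under simplifying assumptions, not how Elasticsearch computes aggregations; avg_field is a hypothetical helper written only for this illustration.

```python
def avg_field(docs, field, null_value=None):
    """Average `field` over the documents that have an indexed value.

    Simplified model (an assumption): an explicit null counts only when a
    null_value is configured; an absent field never counts.
    """
    values = []
    for doc in docs:
        if field not in doc:
            continue  # absent field: null_value does not apply
        value = doc[field]
        if value is None:
            if null_value is None:
                continue  # explicit null with no null_value: no value
            value = null_value  # explicit null replaced at index time
        values.append(float(value))
    return sum(values) / len(values) if values else None


explicit_null = [{"age": None}, {"age": 20}]
absent_field = [
    {"content": "This is cool"},
    {"age": 20, "content": "This is cool too!"},
]

print(avg_field(explicit_null, "age"))                # 20.0
print(avg_field(explicit_null, "age", null_value=0))  # 10.0
print(avg_field(absent_field, "age", null_value=0))   # 20.0
```

The last call mirrors the final case above: the mapping defines a null_value, but the first document never mentions age, so only the value 20 is averaged.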