For public IP addresses, lookup tables can be built that map each address range to a city. But a large part of the Internet is different. Corporate private networks in every country in the world use IP addresses from the 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16 ranges. These addresses usually carry no real information about geographic location, so the GeoIP filters and processors built into Elasticsearch and Logstash do not work with private IPs.

Both Elasticsearch and Logstash let you optionally specify a particular database file to use (the database_file and database options, respectively), so in theory you could build a custom one. However, this can be time-consuming and costly to maintain, not to mention that you may have to learn a new set of tools just to build those .mmdb files (a future topic).

A simpler approach can be implemented using another tool already built into Elasticsearch (Enrich Processor), which can (surprisingly) enrich our documents with any kind of data we want, including geographic data. Let’s look at an example of doing this.

 

Example: Enrich private IP by continent, country, and city

In this example, I only cover cities in a few arbitrarily chosen countries, but it is easy to see how this could be refined down to the level of villages or even specific buildings, or coarsened to whole countries or regions. It depends only on the requirements of your use case.

Let’s start by creating an index for private_geoips:

PUT private_geoips 
{ 
  "mappings": { 
    "properties": { 
      "city_name": { 
        "type": "keyword" 
      }, 
      "continent_name": { 
        "type": "keyword" 
      }, 
      "country_iso_code": { 
        "type": "keyword" 
      }, 
      "country_name": { 
        "type": "keyword" 
      }, 
      "location": { 
        "type": "geo_point" 
      }, 
      "source.ip": { 
        "type": "ip" 
      } 
    } 
  } 
}

To do this, we need to know which private IP addresses are used at which office locations. Given the nature of private networks, overlap is always possible. For this small example, we will imagine that we control the subnets of a few offices, which I assign as follows:

10.4.0.0/16 – Pretoria, South Africa
10.5.0.0/16 – Toronto, Canada
10.6.0.0/16 – Berlin, Germany
10.7.0.0/16 – Tokyo, Japan
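Since overlapping subnets would make the mapping ambiguous, it can be worth checking the plan before loading any data. The following is a minimal sketch using only Python's standard ipaddress module. The OFFICE_SUBNETS table and both helper functions are hypothetical names introduced for illustration, and the Pretoria /16 is an assumption inferred from the sample data used in this example.

```python
import ipaddress
from itertools import combinations

# Hypothetical subnet plan for this example.
# Assumption: the Pretoria /16 is inferred from the sample addresses (10.4.54.x).
OFFICE_SUBNETS = {
    "10.4.0.0/16": "Pretoria, South Africa",
    "10.5.0.0/16": "Toronto, Canada",
    "10.6.0.0/16": "Berlin, Germany",
    "10.7.0.0/16": "Tokyo, Japan",
}


def find_overlaps(subnets):
    """Return every pair of CIDR blocks that share at least one address."""
    networks = [ipaddress.ip_network(cidr) for cidr in subnets]
    return [
        (str(a), str(b))
        for a, b in combinations(networks, 2)
        if a.overlaps(b)
    ]


def office_for(ip, subnets=OFFICE_SUBNETS):
    """Return the office whose subnet contains ip, or None if none matches."""
    address = ipaddress.ip_address(ip)
    for cidr, office in subnets.items():
        if address in ipaddress.ip_network(cidr):
            return office
    return None


if __name__ == "__main__":
    print(find_overlaps(OFFICE_SUBNETS))
    print(office_for("10.5.231.89"))
```

An empty list from find_overlaps means every private address maps to at most one office, which is the property the lookup index relies on.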

Next, we import the following data into the private_geoips index:

POST private_geoips/_bulk
{"index":{"_id":"pretoria-south-africa"}}
{"city_name":"Pretoria","continent_name":"Africa","country_iso_code":"SA","country_name":"South Africa","location":[28.21837,-25.73134],"source.ip":["10.4.54.6","10.4.54.7","10.4.54.8"]}
{"index":{"_id":"toronto-canada"}}
{"city_name":"Toronto","continent_name":"North America","country_iso_code":"CA","country_name":"Canada","location":[-79.347015,43.65107],"source.ip":["10.5.231.89","10.5.231.90","10.5.231.91"]}
{"index":{"_id":"berlin-germany"}}
{"city_name":"Berlin","continent_name":"Europe","country_iso_code":"DE","location":[13.404954,52.520008],"country_name":"Germany","region_name":"","source.ip":["10.6.132.43","10.6.132.44"]}
{"index":{"_id":"tokyo-japan"}}
{"city_name":"Tokyo","continent_name":"Asia","country_iso_code":"JP","country_name":"Japan","location":[139.839478,35.652832],"source.ip":["10.7.1.76"]}
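A bulk request body is newline-delimited JSON: each action line and each document must sit on its own line, and the body must end with a newline. Writing that by hand is error-prone, so here is a minimal sketch that generates the body from a dictionary. The build_bulk_body function and the offices variable are hypothetical names, and only one office is shown to keep the example short.

```python
import json


def build_bulk_body(documents):
    """Serialize {doc_id: document} into an NDJSON body for the _bulk API."""
    lines = []
    for doc_id, document in documents.items():
        # Action line first, then the document itself, each on a single line.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(document))
    # The bulk body must be terminated by a final newline.
    return "\n".join(lines) + "\n"


# Hypothetical input: one entry per office, keyed by the document _id.
offices = {
    "tokyo-japan": {
        "city_name": "Tokyo",
        "continent_name": "Asia",
        "country_iso_code": "JP",
        "country_name": "Japan",
        "location": [139.839478, 35.652832],  # geo_point arrays are [lon, lat]
        "source.ip": ["10.7.1.76"],
    },
}

if __name__ == "__main__":
    print(build_bulk_body(offices), end="")
```

The printed text can be pasted under a POST private_geoips/_bulk line in the console.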

All documents that we store in the private_geoips index must contain every field listed in the enrich_fields array of the enrich policy, which we create next:

PUT _enrich/policy/private_geoips_policy
{
  "match": {
    "indices": "private_geoips",
    "match_field": "source.ip",
    "enrich_fields": [
      "city_name",
      "continent_name",
      "country_iso_code",
      "country_name",
      "location"
    ]
  }
}
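The requirement that every lookup document carries the match field and all of the enrich_fields can be checked before indexing. This is only a sketch: missing_fields is a hypothetical helper, and the field names are copied from the policy above.

```python
# Field names copied from the enrich policy in this example.
MATCH_FIELD = "source.ip"
ENRICH_FIELDS = [
    "city_name",
    "continent_name",
    "country_iso_code",
    "country_name",
    "location",
]


def missing_fields(document, required=(MATCH_FIELD, *ENRICH_FIELDS)):
    """Return the required field names that are absent from a lookup document."""
    return [name for name in required if name not in document]


if __name__ == "__main__":
    incomplete = {"city_name": "Tokyo", "source.ip": ["10.7.1.76"]}
    print(missing_fields(incomplete))
```

Running this over every document before the bulk import catches incomplete entries early.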

For those of you who are not familiar with the Enrich Processor, see my previous article “Elasticsearch: How to Enrich logs and metrics with Elasticsearch ingest nodes.” When the source.ip field of an incoming document matches one of the source.ip values in the private_geoips index, the other fields (“city_name”, “continent_name”, “country_iso_code”, “country_name”, and “location”) from the matching private_geoips document are added to the incoming document.

After we first create a policy, and again whenever the source index changes, we need to execute the policy so that the enrich index is built or rebuilt:

POST /_enrich/policy/private_geoips_policy/_execute

Let’s create our ingest pipeline:

PUT /_ingest/pipeline/private_geoips 
{ 
  "description": "_description", 
  "processors": [ 
    { 
      "dot_expander": { 
        "field": "source.ip" 
      } 
    }, 
    { 
      "enrich": { 
        "policy_name": "private_geoips_policy", 
        "field": "source.ip", 
        "target_field": "geo", 
        "max_matches": "1" 
      } 
    }, 
    { 
      "script": { 
        "lang": "painless", 
        "source": "ctx.geo.remove('source.ip')" 
      } 
    } 
  ] 
}
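To make the three processors easier to follow, here is a plain-Python approximation of what the pipeline does to a document. It is a sketch for understanding, not how Elasticsearch implements these processors: all function names are hypothetical, the lookup table holds a single sample entry, and matching is simplified to string equality.

```python
import copy

ENRICH_FIELDS = ["city_name", "continent_name", "country_iso_code",
                 "country_name", "location"]

# Sample lookup data standing in for the enrich index (one entry only).
LOOKUP_DOCS = [
    {"city_name": "Tokyo", "continent_name": "Asia", "country_iso_code": "JP",
     "country_name": "Japan", "location": [139.839478, 35.652832],
     "source.ip": ["10.7.1.76"]},
]


def dot_expander(doc, field):
    """Turn a flat dotted key such as 'source.ip' into nested objects."""
    if field not in doc:
        return
    value = doc.pop(field)
    *parents, leaf = field.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def get_path(doc, path):
    """Read a nested value by dotted path; return None if it is absent."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def enrich(doc, lookup_docs, field, match_field, target_field):
    """Copy the match field and enrich fields of the first matching entry."""
    value = get_path(doc, field)
    for entry in lookup_docs:
        candidates = entry[match_field]
        if not isinstance(candidates, list):
            candidates = [candidates]
        if value in candidates:
            wanted = (match_field, *ENRICH_FIELDS)
            doc[target_field] = {k: copy.deepcopy(entry[k])
                                 for k in wanted if k in entry}
            return


def run_pipeline(source):
    """Apply dot_expander, enrich, and the cleanup step to a copy of source."""
    doc = copy.deepcopy(source)
    dot_expander(doc, "source.ip")
    enrich(doc, LOOKUP_DOCS, "source.ip", "source.ip", "geo")
    # Cleanup step: drop the copied match field, but only if a match was found.
    if "geo" in doc:
        doc["geo"].pop("source.ip", None)
    return doc


if __name__ == "__main__":
    print(run_pipeline({"source.ip": "10.7.1.76"}))
    print(run_pipeline({"source.ip": "10.9.9.9"}))
```

Note that the sketch skips the cleanup step when no lookup entry matches, so an unknown address simply passes through without a geo object.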

Finally, let’s test the ingest pipeline with the simulate API:

POST /_ingest/pipeline/private_geoips/_simulate
{
  "docs": [
    {
      "_source": {
        "source.ip": "10.7.1.76"
      }
    },
    {
      "_source": {
        "source.ip": "10.4.54.7"
      }
    }
  ]
}

Running the above test, we can see the following results:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "geo" : {
            "continent_name" : "Asia",
            "country_name" : "Japan",
            "city_name" : "Tokyo",
            "location" : [139.839478, 35.652832],
            "country_iso_code" : "JP"
          },
          "source" : {
            "ip" : "10.7.1.76"
          }
        },
        "_ingest" : {
          "timestamp" : "2020-09-16T01:….901889Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "geo" : {
            "continent_name" : "Africa",
            "country_name" : "South Africa",
            "city_name" : "Pretoria",
            "location" : [28.21837, -25.73134],
            "country_iso_code" : "SA"
          },
          "source" : {
            "ip" : "10.4.54.7"
          }
        },
        "_ingest" : {
          "timestamp" : "2020-09-16T01:….901897Z"
        }
      }
    }
  ]
}

That’s it! Your private IP addresses are now enriched with the geolocation data stored in the private_geoips index. From here on, all you need to do is maintain the lookup index and re-execute the policy whenever it changes.