Abstract:

Data migration for Elasticsearch is something engineers do often: for cluster migration, for data backup, for version upgrades, and so on. Common tools include elasticsearch-dump, snapshot, and reindex. This article walks through an Elasticsearch cluster migration solution based on Logstash.

The idea is simple: a Logstash instance sits between the two Elasticsearch clusters, reading from the source cluster and writing to the target cluster, driven by a single conf file.

Create a Logstash conf file for data synchronization in the Logstash directory:

vim ./logstash-5.5.3/es-es.conf

In the conf file, since we only need to relocate the index, the index name on the target cluster is kept the same as on the source cluster:

input {
    elasticsearch {
        hosts => ["********your host**********"]
        user => "*******"
        password => "*********"
        index => "logstash-2017.11.07"
        size => 1000
        scroll => "1m"
    }
}
# The filter section is optional and left empty here
filter {
}
output {
    elasticsearch {
        hosts => ["***********your host**************"]
        user => "********"
        password => "**********"
        index => "logstash-2017.11.07"
    }
}

After the conf file is configured, run the Logstash command:

bin/logstash -f es-es.conf

When executing this command, you sometimes get the following error message

[FATAL][logstash.runner] Logstash could not be started because there is already another instance using the configured data directory. If you wish to run multiple instances, you must change the "path.data" setting.

This is because the current Logstash version does not allow multiple instances to share the same path.data, so you need to add "--path.data" with a distinct path on the command line at startup to give each instance its own data directory:

bin/logstash -f es-es.conf --path.data ./logs/

If this goes well, run the following command and you should see the corresponding index in the target Elasticsearch cluster:

curl -u username:password host:port/_cat/indices

Here's a practical scenario for migrating Elasticsearch indexes in bulk using Logstash:

**Many Elasticsearch customers have recently been asking how to migrate their own data to Alibaba Cloud Elasticsearch. Here's how to quickly move index data to the cloud using Logstash.**

The logic of this solution is simple. The brute-force approach is to configure N es-to-es conf files, one per index, but that is tedious. In fact, Logstash provides the ability to do this in batches. To use it, three important concepts need to be introduced:

  • Metadata: since version 1.5, Logstash has had the concept of metadata (`@metadata`) for describing an event. Metadata can be read and modified by the user, but it is not written into the event's output. As the event's metadata description, it survives the entire execution cycle of the input, filter, and output plugins.

Make Your Config Cleaner and Your Log Processing Faster with Logstash Metadata

  • Docinfo: an option of the Elasticsearch input plugin, false by default. The documentation says: "If set, include Elasticsearch document information such as index, type, and the id in the event." In other words, when this option is enabled, the document's index, type, and id are all recorded in the event's metadata, which survives the event's whole execution cycle, so the user can freely reference the index, type, and id fields;
  • * * * * * * * * * * * * * * * * * * * *

With docinfo enabled, the output can "inherit" the index and type information from the metadata and create the same index and type (and even the same id) in the target cluster as in the source cluster.
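As a minimal sketch of that inheritance (the host is a placeholder, and `document_type` applies to the 5.x-era Logstash used in this article), an output that reuses the index, type, and document id from the metadata could look like this:

```
output {
    elasticsearch {
        hosts => ["yourhost"]
        index => "%{[@metadata][_index]}"
        document_type => "%{[@metadata][_type]}"
        document_id => "%{[@metadata][_id]}"
    }
}
```

Reusing `document_id` makes the run idempotent: re-running the migration overwrites documents instead of duplicating them.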

If you want to see the metadata and do debug-style checks on it, add the following configuration to output:

stdout { codec => rubydebug { metadata => true } }
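For reference, with docinfo enabled an event printed by rubydebug looks roughly like the following (the values here are purely illustrative):

```
{
       "message" => "...",
    "@timestamp" => 2017-11-07T08:00:00.000Z,
      "@version" => "1",
     "@metadata" => {
        "_index" => "logstash-2017.11.07",
         "_type" => "logs",
           "_id" => "AV-example-id"
    }
}
```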

The sample configuration code is as follows:

input {
    elasticsearch {
        hosts => ["yourhost"]
        user => "**********"
        password => "*********"
        index => "*"    # This wildcard means all indexes are read
        size => 1000
        scroll => "1m"
        codec => "json"
        docinfo => true
    }
}
# The filter section is optional and left empty here
filter {
}

output {
    elasticsearch {
        hosts => ["yourhost"]
        user => "********"
        password => "********"
        index => "%{[@metadata][_index]}"
    }
    stdout { codec => rubydebug { metadata => true } }
}

Once this runs, Logstash copies all the indexes in the source cluster to the target cluster, carries the mapping information over, and then migrates the data in each index.
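To sanity-check the result, you can compare the index lists of the source and target clusters. Below is a minimal Python sketch; the host URLs are placeholders, and the parsing helper assumes the default plain-text `_cat/indices` format in which the index name is the third column:

```python
# Compare index names between two clusters after a Logstash migration.
import urllib.request


def parse_cat_indices(text):
    """Extract index names from plain-text `_cat/indices` output.

    Default column layout:
    health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
    """
    names = set()
    for line in text.splitlines():
        cols = line.split()
        if len(cols) >= 3:
            names.add(cols[2])
    return names


def fetch_indices(base_url):
    """Fetch and parse `_cat/indices` from a cluster.

    Assumes an unsecured endpoint; for a secured cluster, send an
    Authorization header instead of a bare urlopen call.
    """
    with urllib.request.urlopen(base_url + "/_cat/indices") as resp:
        return parse_cat_indices(resp.read().decode("utf-8"))


# Usage against live clusters (hosts are placeholders):
#   missing = fetch_indices("http://source:9200") - fetch_indices("http://target:9200")
#   print("indexes missing on target:", sorted(missing) or "none")
```

Any index present on the source but missing on the target shows up in the set difference, which is a quick way to spot a partially failed batch run.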

Suggestion: for the formal migration run, you are advised to remove the following configuration item:

stdout { codec => rubydebug { metadata => true } }

Otherwise the screen will be flooded with metadata output.