This is the 6th day of my participation in the wenwen Challenge

With the growth of service data, the pressure of Mysql is increasing. Single table storage will seriously affect the efficiency of read and write. To solve this problem, solutions such as table and library partition and historical data migration have emerged, but there are still problems in some scenarios, such as aggregated query. As a real-time distributed search and analysis engine, ElasticSearch can quickly query and process massive data, and is highly available and scalable. With ES, it can easily implement aggregated query to ensure query efficiency, which has been successfully used by many large Internet enterprises.

After using ES for a period of time, it is found that the real-time insertion of data has a relatively large consumption of network bandwidth, and the high concurrency has a great pressure on the server. Therefore, can we periodically query the SQL database to maintain data synchronization to decouple this problem?

After investigation, there are basically the following tools to achieve synchronization with Mysql:

  • Logstash -input- JDBC: the official recommended plug-in, written in Ruby
  • Go-mysql-elastic: Written by Go, author in China
  • Python – mysql replication: python
  • Elasticsearch – JDBC: written in Java
  • Elasticsearch -river-mysql: very little maintenance already

0. Install ElasticSearch

The ES installation process is not covered in this article.

1. Download the Logstash

Take version 6.1.1 as an example:

Wget https://artifacts.elastic.co/downloads/logstash/logstash-6.1.1.zip # waiting for download, performed decompression unzip logstash - 6.1.1. ZipCopy the code

2. Install the logstash-input- JDBC plug-in

Go to the decompressed directory CD logstash-6.1.1

Run sudo bin/logstash-plugin install logstash-input- JDBC as user root

Add-on: logstash5.x is a plugin that comes with logstash5.x, but I have installed it, so I will not verify it.

The installation process will be a little slow. I succeeded locally, and the server is stuck with the Installing Logstash -input- JDBC. The next morning, I tried again and succeeded.

CD logstash - 6.1.1 wget unzip at https://github.com/logstash-plugins/logstash-input-jdbc/archive/v4.3.1.zip Logstash - input - JDBC - this zip - 4.3.1 / CD logstash - input - JDBCCopy the code

To perform:

vi Gemfile
Copy the code

Change source to: “https://ruby.taobao.org” or “https://gems.ruby-china.org”

Continue after saving:

vi logstash-input-jdbc.gemspec
Copy the code

Find this line:

s.files = `git ls-files`.split($\)
Copy the code

To:

s.files = [".gitignore", "CHANGELOG.md", "Gemfile", "LICENSE", "NOTICE.TXT", "README.md", "Rakefile", "lib/logstash/inputs/jdbc.rb", "lib/logstash/plugin_mixins/jdbc.rb", "logstash-input-jdbc.gemspec", "spec/inputs/jdbc_spec.rb"]
Copy the code

To continue:

gem build logstash-input-jdbc.gemspec cd .. Bin/plugin install logstash - input - JDBC - this gemCopy the code

This method is manually installed and requires you to specify the version of the plug-in. Refer to the plugins-Elsies-JDBC documentation to check the latest version of the plug-in.

3. Associate the Mysql database

Download the mysql-connector-java-6.0.6 jar package for mysql connection

Go to logstash and create a conf file.

CD logstash 6.1.1 / config vim JDBC. ConfCopy the code

Edit and paste the following:

Input {JDBC {# Jar package path jdbc_driver_library =>"/opt/mysql-connector-java-6.0.6.jar" jdbc_driver_class = > "com. Mysql. JDBC Driver" # database address jdbc_connection_string = > "JDBC: mysql: / / 192.168.0.1:3306 / myTest? CharacterEncoding = UTF-8&usessL =false" jdbc_user =>"root" jdbc_password =>"123456" # SQL statement executed, #statement_filepath =>" /opt/test. SQL "statement =>"SELECT * FROM tname"}} output {elasticSearch Document_type => document_type document_type => document_type document_type Document_id =>"%{ID}" template_overwrite => true hosts =>["127.0.0.1:9200"]}}Copy the code

To execute SQL statements in filepath format, you need to create an additional SQL file and write SQL statements in it, which is omitted here.

Return to the logstash-6.1.1 directory and execute:

bin/logstash -f config/jdbc.conf
Copy the code

Mysql > alter table SQL > alter table SQL > alter table SQL > alter table SQL > alter table SQL > alter table SQL > alter table SQL It is important to note that the update operation is currently not implemented by plug-ins.

To view the synchronized data, visit http:// server IP address :9200/name_index/name_type/_search.

Finally, the process is mounted in the background so that it has to do periodic query.

4. Operation data

Select the url of elasticSearch. url from kibana, download it to the local directory, decompress it, and enter config/kibana.yml. Run the bin/kibana browser to visit http://localhost:5601, and you can use the ES server on the local connection line. Common functions such as Dev Tools.

Elasticsearch: The definitive guide

5. Query multiple tables

Input {JDBC {jdbc_driver_library =>"/opt/mysql-connector-java-6.0.6.jar" jdbC_driver_class =>" com.mysql.jdbc.driver" jdbc_connection_string =>"jdbc::3306/? characterEncoding=UTF-8&useSSL=false" jdbc_user =>"" jdbc_password =>"" schedule =>"* * * * *" statement_filepath => "/opt/order_goods.sql" #statement =>"SELECT * FROM tname" type => "jdbc_order_goods" } jdbc { jdbc_driver_library = > "/ opt/mysql connector - Java - 6.0.6. Jar" jdbc_driver_class = > ". Com. Mysql. JDBC Driver "jdbc_connection_string =>"jdbc::3306/? characterEncoding=UTF-8&useSSL=false" jdbc_user =>"" jdbc_password =>"" schedule =>"* * * * *" statement_filepath => "/opt/user.sql" type => "jdbc_user" } } output { if[type] == "jdbc_order_goods"{ elasticsearch { hosts =>["127.0.0.1:9200"] index =>"order_goods_index" document_type =>"order_goods_type" document_id =>"%{id}"}} if[type] == "jdbc_user"{elasticSearch {hosts =>["127.0.0.1:9200"] index => "user_index" document_type => "user_type" document_id => "%{id}" } } }Copy the code