At the end of the year job-hopping pay technology which strong, but also have to learn Java and beautiful boy!

Near the end of the New Year, as a programmer, I cannot do without the topic of job-hopping and salary increase. Recently, I am also responsible for building the architecture of ElasticSearch as Hbase secondary index in the company. I’m sharing ElasticSearch as a series.

What is elastic search

Wondering why it’s called elastic search. ElasticSearch is the name of ElasticSearch for the first time. ElasticSearch is the name of ElasticSearch for the first time. ElasticSearch is a distributed, RESTful search and analytics engine. ElasticSearch is usually labeled as: full text search, inverted index, search engine.

Take the first step in MySQL

MySQL as we contact the most, the most familiar database, his importance is beyond doubt. Maybe you’re having some problems with MySQL. Anyway, try ElasticSearch to start doing some data stores instead of MySQL.

MySQL > select * from ‘MySQL’;

select * from tableName where name like "%XX%";
Copy the code

You will also often see this advice: don’t write % to the left of MySQL fuzzy queries, this will cause index invalidation! There is nothing wrong with this statement, but what if the product little sister gives the requirement to be a fuzzy query? Dare you tell him out loud that the demand is unreasonable and the index will fail? I don’t think you’d say no to a demand like that, so what are some of the solutions that come to mind?

  1. fate

May finish the function, the use of feeling without any problems, what index failure does not fail, I do not feel, query speed did not feel reduced. So I’m going to go out on a limb and guess that the amount of data in the table is probably in the tens of thousands. Data volume growth is also very low. Such a situation to consider the performance is really not too humane, first consider whether the survival of the product is king!

  1. Disable fuzzy search

When the project runs for a period of time, the amount of data up, slow SQL log a day run non-stop, then find the reason is caused by fuzzy search, after discussion, cut ~!! When you can’t solve it with technology, cutting out some features to ensure usability is an option

  1. Optimize it the way MySQL does

See many optimization examples on the Internet, here we directly use MySQL to do verification, see the final optimization effect! First use Java code to build 50 million, build code has been included in MySQL build

Use the following two SQL statements to query in MySQL, and the results of the query are not surprising

Select * from emp where age > 15 and user_name like "%" LIMIT 100; Select * from emp where age > 15 and user_name like "% sho %" LIMIT 100;Copy the code

As we all know, MySQL Like fuzzy query, when % is to the left of fuzzy match, can not use the index, will perform full table search, so inefficient, while the first statement does not have % left, can use the index.

Q: If you use user_name as the condition, the efficiency is actually acceptable, you smart, know why?

MySQL provides full text search support for InnoDB in 5.6. Full text search is a kind of index, called inverted index, which will be covered by ElasticSearch.

The following test is a fuzzy query in different cases through the email field.

select * from emp where email like "5291%" and age = 15 ;
# fulltext grammar
select * from emp where MATCH(email) AGAINST("5921 *" IN BOOLEAN MODE) and age = 15
Copy the code
The index Elapsed time
There is no index 383 seconds
Btree index 588 seconds
Fulltext indexing 89 seconds
Fulltext is a bit better at optimizing fuzzy queries, but 89 seconds is still too long for the business to accept.
# # # ElasticSearch queries
Install ElasticSearch (ES7.10.1) and select Cerebro for visualization. Select Datax to import data from ElasticSearch. If you do not know how to use Datax, go to the next stepDatax basic use.
If you use ElasticSearch to query data whose email address starts with 5921 and whose age is 15, it takes 491ms. If you use ElasticSearch to query data whose email address starts with 5921 and whose age is 15, it takes 491ms. You won’t believe it, MySQL is 400 + SEC slow, ElasticSearch is 400 + ms, a thousandfold difference!

Technology selection

There’s no doubt that MySQL is failing, but that doesn’t mean you should abandon MySQL and embrace ElasticSearch. MySQL is a relational database, which is good at dealing with structured data and has good support for associated query. After 5.6, it can basically deal with scenarios other than full-text retrieval under the condition that the data volume is not very large. ElasticSearch is NoSQL and stores unstructured data. Although it can be associated and aggregated, it is not recommended to use ElasticSearch in these scenarios.

MySQL ElasticSearh instructions
The real time real-time Near real time ElasticSearch has a lower write efficiency than MySQL, so MySQL is preferred if it requires high real-time performance
The data structure Fully structured data Unstructured data MySQL is preferred for structured data, but if there is also a search requirement, can you consider ElasticSearch
The transaction support Does not support ElasticSearch does not support transactions. There are strict transaction requirements for ElasticSearch
Query performance Performance deteriorates after data volume increases Supports large amounts of data ElasticSearch works better in search scenarios, and if there is a search scenario, don’t hesitate to select ElasticSearch

Write in the last

Regardless of your current data volume and scenario, I recommend you take advantage of ElasticSearch and you’ll find another world. You know, for search!