Foreword: Embrace the new version

Upgrading Elasticsearch is a very cheap way to improve both performance and the coding experience: a newer version can bring a 10% to 30% performance gain in a given scenario. If you cannot keep up with every minor release, at least keep up with the major versions, and try not to fall a full major version behind the official release. I am living through this right now: an upgrade that spans three major versions is truly painful!

Query optimization

SSD

The benefits of SSDs go without saying, but the hardware is usually not ours to decide; just keep it in mind.

Suitable memory

Elasticsearch is a very, very memory-hungry search engine: sorting (sort), aggregations (agg), tokenization (word segmentation), inverted indexes, and so on all consume memory constantly, and there must be enough JVM heap to keep this in balance.

A machine with 64 GB of memory is ideal. Set the JVM heap to half of the machine's memory (32 GB) so that there is both enough heap and enough off-heap memory. Lucene is responsible for full-text retrieval, and its performance depends on its interaction with the operating system. If you allocate all of the memory to Elasticsearch's heap, nothing is left for Lucene, which severely hurts full-text retrieval performance; running short of off-heap memory also frequently ends in OOM.

  • The standard recommendation is to give 50% of the available memory to the Elasticsearch heap and leave the remaining 50% free. It will not go to waste: Lucene will happily use the leftover memory for the file system cache.

Also watch out for Elasticsearch's 32 GB heap limit! To this day the official documentation does not recommend going above 32 GB:

Heap: Sizing and Swapping | Elasticsearch: The Definitive Guide | Elastic

The limit comes from compressed ordinary object pointers (compressed oops): above roughly 32 GB pointer compression fails and memory is wasted. It is said that a 50 GB heap performs about the same as a 31 GB one, while garbage collection pressure is much higher.

-Xms31g
-Xmx31g
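If you want to confirm that pointer compression is still in effect after sizing the heap, the nodes info API reports it per node; a quick check (the exact field name may differ slightly between versions):

GET _nodes/jvm?filter_path=**.using_compressed_ordinary_object_pointers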

If you have a 128 GB machine, you can run two nodes on it, each with about 32 GB of heap, joining the same cluster. If you are in this situation (multiple nodes sharing one physical machine and its disks), be sure to set:

cluster.routing.allocation.same_shard.host: true

This prevents the primary and the replica of the same shard from being allocated to the same physical machine (if they sit on the same machine, the replica no longer provides any high availability).

_routing (data routing)

Elasticsearch's _routing (data routing) is easy to use, and the idea is similar to the routing in sharding middleware such as ShardingSphere: the _routing value determines which shard a document lands on. In general, the more evenly the values are distributed, the better. The author works on a SaaS platform, so the company (tenant) identifier is used as _routing; a log system could use a time granularity instead. Once the data volume grows, queries that carry _routing become dramatically faster.

Without _routing, Elasticsearch's internal query flow is:

  • Distribution: after the request reaches the coordinating node, the coordinating node forwards the query to every shard.
  • Aggregation: the coordinating node collects the results from every shard, sorts them, and returns them to the user.

With large data volumes, every query should carry a _routing value. In special business scenarios where no natural _routing key exists, a unified (fixed) _routing value can be used, as in the sketch below.
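A minimal sketch (the index name, field, and routing value below are made-up examples): write a document with an explicit routing value, then query with the same value so that only the matching shard is searched.

PUT my_index/_doc/1?routing=company_42
{
  "companyId": "company_42",
  "name": "some document"
}

GET my_index/_search?routing=company_42
{
  "query": {
    "term": { "companyId": "company_42" }
  }
}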

Do not tokenize fields that are not involved in search

Many business fields do not need tokenization at all: association IDs, fields that never participate in search, status codes, and so on. Such fields must be declared as not analyzed:

index:"no_analyzer"Or type:"keyword"

Such a field is matched exactly: its value is not tokenized, so it contributes a single term instead of many, which keeps segments smaller and saves memory, leaving room to cache more useful data. Memory is limited, so spend it wisely.
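A sketch of such a mapping in the 7.x single-type style (the field names are examples only):

PUT my_index
{
  "mappings": {
    "properties": {
      "companyId": { "type": "keyword" },
      "status":    { "type": "keyword" },
      "title":     { "type": "text" }
    }
  }
}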

  • Check the segment memory usage of an index:
GET _cat/segments/indexName?v

size: the disk space occupied by the segment

size.memory: the memory occupied by the segment

Index splitting

**First, stop creating multiple types under one index!** Even if you are on 2.x or 5.x, where multiple types are not yet banned, split them apart as soon as possible, both for future version upgrades and for later extensibility. From a lot of reading on the official site I learned that putting multiple types in one index causes data skew, much like a skewed hash ring: data ends up unevenly (sparsely) distributed and documents compress poorly. A dense distribution is more in line with what Lucene expects and compresses better. In newer versions each index defaults to a single type, _doc.

Second, for a log system along the lines of ELK or ELFK, use rollover + index templates to roll indices over by time. This avoids the search disaster of a single index with a fixed number of shards accumulating a huge amount of data (say, 10 billion docs in an index with only 5 shards). Rolling over wastes no shards while keeping search near real time.

Deep paging is not allowed

This is how I convince the product manager: by default Elasticsearch only returns the top 10,000 best-matching documents. If you really need to go further (for example, a report calculation that needs the full data set), you have to raise max_result_window or use Scroll for a one-off bulk read or page-by-page traversal.
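A sketch of a Scroll read (the index name and page size are examples): open a scroll context, then keep fetching pages with the returned _scroll_id until the hits run out.

GET my_index/_search?scroll=2m
{
  "size": 1000,
  "query": { "match_all": {} }
}

GET _search/scroll
{
  "scroll": "2m",
  "scroll_id": "<_scroll_id from the previous response>"
}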

Disable wildcard

To this day we still have wildcard queries like this buried in our code:

{
  "bool": {
    "must": [
      { "wildcard": { "name": "${name}" } },
      { "term": { "companyId": "${companyId}" } }
    ]
  }
}

An example thread: discuss.elastic.co/t/wildcard-…

Wildcard is a performance killer. With a large amount of data it queries very slowly, easily gets stuck, and can even crash or bring down cluster nodes.

Alternatives: combine the standard analyzer with IK and retrieve with match_phrase; build a custom ngram analyzer and search with match_phrase; or, if a less precise search is acceptable, search with match_phrase plus slop. A sketch of the ngram approach follows.
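A sketch of the ngram alternative (the analyzer settings and field name are illustrative, not a drop-in mapping): index the field with an ngram analyzer, then use match_phrase instead of wildcard.

PUT my_index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "ngram_2_3": { "type": "ngram", "min_gram": 2, "max_gram": 3 }
      },
      "analyzer": {
        "ngram_analyzer": { "type": "custom", "tokenizer": "ngram_2_3" }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": { "type": "text", "analyzer": "ngram_analyzer" }
    }
  }
}

GET my_index/_search
{
  "query": {
    "match_phrase": { "name": "partial text" }
  }
}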

To satisfy a wildcard query, Elasticsearch may have to walk every term in the field, which brings significant processing overhead. It works, but it really does not hold up at scale.

  • Setting search.allow_expensive_queries: false disables wildcard (and other expensive) queries; see the sketch below.
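On 7.7 and later this is a dynamic cluster setting, so it can be toggled at runtime; a sketch:

PUT _cluster/settings
{
  "transient": {
    "search.allow_expensive_queries": "false"
  }
}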

Data preheating

If a segment is not already in memory, it has to be loaded from disk into the cache first, which is much slower than querying a segment that is already resident. After a while the cached data may be evicted in favour of segments from other indices, and the next search has to go through the same slow process again:

Because Elasticsearch memory is limited, you can picture the data it holds as a fixed-size space managed by something like an LRU algorithm. We can therefore trigger a search over the hot data every so often, so that it always stays in ES memory (the JVM heap and file system cache).
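A sketch of such a warm-up request, fired periodically from a scheduled job (the index, field, and value are examples); size: 0 keeps the response small while still exercising the query path and caches:

GET hot_index/_search
{
  "size": 0,
  "query": {
    "term": { "companyId": "company_42" }
  }
}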

Write optimization

Rollover is the best choice for logging

Without repeating the log discussion above: use rollover + index templates to roll indices over by time, avoiding the search disaster of a single index with a fixed number of shards accumulating a huge amount of data (again, 10 billion docs in a 5-shard index). Rolling over wastes no shards while keeping search near real time.

For log systems built on ELK or ELFK it is even simpler: the stack already provides sensible rollover rules, and a little configuration is all it takes.
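A sketch of the mechanism (alias name, index pattern, and thresholds are examples; PUT _index_template is the 7.8+ composable-template API, older versions use PUT _template):

# template applied to every rolled index
PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": { "number_of_shards": 5, "number_of_replicas": 1 }
  }
}

# bootstrap the first index behind a write alias
PUT logs-000001
{
  "aliases": { "logs_write": { "is_write_index": true } }
}

# roll over when either condition is met (run from a scheduler, or let ILM do it)
POST logs_write/_rollover
{
  "conditions": {
    "max_age": "1d",
    "max_docs": 50000000
  }
}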

Reduce the number of times you Refresh

ES is a near-real-time engine, not a real-time one. The default refresh interval is 1 s: Lucene writes data to memory first, and after one second (by default) a refresh is triggered, flushing the in-memory data to the operating system's file cache, where it becomes searchable.

In scenarios where search freshness is not critical, such as reports or logs generated in the middle of the night, you can increase the default 1 s refresh interval to reduce the number of segments generated and the resulting segment merges.

index.refresh_interval: 20s
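The same setting can also be changed on an existing index at runtime through the settings API (the index name is an example):

PUT my_index/_settings
{
  "index": { "refresh_interval": "20s" }
}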

Mount each data directory on a different disk

This follows from the 32 GB heap scenario above: the 128 GB server runs two nodes, each with its own data directory, so we split the two directories and mount them on different disks (they must be different physical disks, not partitions of the same disk……). The effect is similar to RAID 0 and can greatly improve write speed. It is not unlimited, of course: the network card bandwidth eventually becomes the bottleneck. On cloud servers with mechanical disks this is worthwhile; with SSDs it is usually unnecessary.

  • The prerequisite: one machine, different nodes, different physical disks (see the sketch below)
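A sketch of the two-node layout in elasticsearch.yml (node names, ports, and mount points are examples):

# node 1
node.name: node-1
path.data: /mnt/disk1/es-data
http.port: 9200

# node 2
node.name: node-2
path.data: /mnt/disk2/es-data
http.port: 9201

# on both nodes, as discussed above
cluster.routing.allocation.same_shard.host: true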

This method applies to other distributed storage systems.

GitHub article archive: github.com/pkwenda/Blo…