ELK Tips is a collection of tips for using ELK, gathered from the Elastic Chinese community.

I. Logstash

1. Main parameters for Logstash performance tuning

  • pipeline.workers: Sets the number of threads that run the filter and output stages. Increase this value when input events are piling up and CPU capacity is still available.
  • pipeline.batch.size: Sets the maximum number of events a single worker thread collects before executing filters and outputs. Larger batches are usually more efficient but increase memory overhead. Output plugins treat each batch as one output unit; for example, the ES output sends one bulk request per batch, so adjusting pipeline.batch.size also adjusts the size of the bulk requests sent to ES.
  • pipeline.batch.delay: Sets the pipeline batch delay, i.e. the maximum time (in milliseconds) a worker thread waits for new events after receiving one for the current batch. In short, if pipeline.batch.size is not reached, the filter and output stages start anyway once pipeline.batch.delay times out (see the sketch below).
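A minimal logstash.yml sketch putting these three parameters together; the values are only illustrative starting points and should be tuned against actual CPU usage and memory:

# logstash.yml (illustrative values, not recommendations)
pipeline.workers: 8        # threads running the filter and output stages
pipeline.batch.size: 1000  # max events per worker batch; also determines the bulk size sent to ES
pipeline.batch.delay: 50   # ms to wait for a batch to fill before flushing it anyway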

II. Elasticsearch

1. The difference between TermsQuery and multiple TermQueries

When the number of terms is small, a TermsQuery is equivalent to a ConstantScoreQuery wrapping multiple TermQuery clauses:

Query q1 = new TermInSetQuery(new Term("field", "foo"), new Term("field", "bar"));

BooleanQuery bq = new BooleanQuery();
bq.add(new TermQuery(new Term("field", "foo")), Occur.SHOULD);
bq.add(new TermQuery(new Term("field", "bar")), Occur.SHOULD);
Query q2 = new ConstantScoreQuery(bq);

When the number of terms is large, it collects the matching documents into a bit set and scores on that bit set, which is more efficient than an ordinary Boolean merge.

So when the number of terms is large, TermsQuery is more efficient than a combination of multiple TermQuery clauses.

2. Configure a domain name for ES with Nginx

upstream data {
    server 192.168.187.xxx:9200;
    keepalive 300;
}

server {
    listen 80;
    server_name testelk.xx.com;
    keepalive_timeout 120s 120s;
    location /data {
        proxy_pass http://data/;
        proxy_http_version 1.1;
        proxy_set_header Connection "Keep-Alive";
        proxy_set_header Proxy-Connection "Keep-Alive";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass_header remote_user;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $http_host;
        proxy_set_header X-Nginx-Proxy true;
    }
}

3. How to reindex in ES without stopping the write service

Option 1: Kennywu76

For an index that receives real-time update/delete operations, an ES reindex cannot achieve true zero downtime even with an alias.

Routing newly added documents to the new index via an alias switch while the old -> new data transfer is in progress is feasible; however, if an update/delete targets a document that has not yet been transferred from the old index, applying it directly to the new index leaves the two indexes inconsistent.

The solution I can come up with (untested) is: if the documents in the database have a field such as last_update_time recording the last update time of each document, use it as the version number of the ES document, and when writing data to the new index include the following URL parameters: version_type=external_gt&version=xxxxxx.

version_type=external_gt means the write succeeds only when the version number of the written document is greater than that of the existing document, or when the document does not exist yet; otherwise a version-conflict exception is thrown. In addition, delete operations need to be converted into index operations whose body can be an empty document.

This way, real-time data and the reindex can write to the new index at the same time. The real-time data should have the higher version and will always succeed; when the reindex hits a version conflict, it means the document has already been updated in real time and the reindexed copy is stale, so it can simply be skipped.
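A minimal sketch of such a real-time write, assuming last_update_time (in epoch milliseconds) doubles as the external version; the index name, document type/id, and fields are placeholders:

PUT new_index/_doc/1?version=1545619200000&version_type=external_gt
{
    "name": "foo",
    "last_update_time": 1545619200000
}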

Shortcomings of the scheme:

  • Data in the data source is required to have version information, which may be difficult to change due to various limitations;
  • The delete operation must be converted into writing an empty document, which effectively acts as a marker (tombstone) document carrying its own version information. However, if a segment merge happens on the backend, the delete may be physically purged during the merge, losing the marker and its version information; if the reindex then writes an older version of that document, there is still a consistency problem. Empty documents also increase the index size and cause extra overhead; one possible mitigation is to delete the empty documents again after the reindex is complete.

Improved plan (the_best):

The steps to rebuild the index are as follows:

  1. Ensure that all delete operations are converted to index operations, which can be an empty document;
  2. Reindex the old index old_index (the business alias still points to the old index) with version_type=external;
curl -X POST 'http://<hostname>:9200/_reindex'
{
    "conflicts": "proceed"."source": {
        "index": "old_index"."size": 1000}."dest": {
        "index": "new_index"."version_type": "external"}}Copy the code
  3. Switch the alias to new_index;
  4. Reindex the hot data generated in old_index during the reindex window into new_index again (conflicts=proceed&version_type=external);
curl -X POST /_reindex
{
    "conflicts": "proceed"."source": {
        "index": "old_index"
        "query": {
            "constant_score" : {
                "filter" : {
                    "range" : {
                        "data_update_time" : {
                            "gte": <reindex milliseconds before the start time timestamp >}}}}}},"dest": {
        "index": "new_index"."version_type": "external"}}Copy the code
  5. Manually delete the empty documents.

How well this approach works depends on the amount of data generated during the reindex (which determines how long step 4 takes), but it can be adapted flexibly to the business. For example, one of our reindexes with a large data volume took 10 hours (more than 2 million new documents were generated within those 10 hours). Before switching the alias, we can repeatedly pull the last ~10 hours of data into the new index using the call from step 4, iterating several times until the alias is switched, which guarantees that the final run of step 4 completes in a short time.

4. Configure ES node communication

http.port: 9200
http.bind_host: 127.0.0.1
transport.tcp.port: 9300
transport.bind_host: 127.0.0.1

5. Pass Lucene’s native Query to ES

SearchRequest searchRequest = new SearchRequest(indexName);
searchRequest.types(typeName);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.from(0);
sourceBuilder.size(10);
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
// q is a Lucene query expression: a bare keyword matches the _all or * field;
// field matches look like user:kimchy, or user:kimchy AND message:Elasticsearch
QueryStringQueryBuilder queryStringQueryBuilder = QueryBuilders.queryStringQuery(q);
sourceBuilder.query(queryStringQueryBuilder);
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest);
SearchHits searchHits = searchResponse.getHits();

6. ES document field limit

By default, an ES index does not allow its mapping to contain more than 1000 fields. Exceeding the limit produces the following error:

failed to put mappings on indices [[[nfvoemspm/srjL3cMMRUqa7DgOrYqX-A]]], type [log]
java.lang.IllegalArgumentException: Limit of total fields [1000] in index [xxx] has been exceeded

You can raise the field limit in the index settings, but it is recommended to optimize the data model for the business instead:

PUT <index>/_settings
{
    "index.mapping.total_fields.limit": 2000
}

7. Convert DSL strings into QueryBuilder

## wrapper
GET /_search
{
    "query" : {
        "wrapper": {
            "query" : "eyJ0ZXJtIiA6IHsgInVzZXIiIDogIktpbWNoeSIgfX0="}}}## RestClient
QueryBuilders.wrapperQuery("{\"term\": {\"field\":\"value\"}}")
Copy the code

8. After the ES cluster restarts, the Slice Scroll speed slows down

When a machine is restarted, the page cache is lost and all data has to be reloaded from disk.

9. Enable logging of index creation and deletion in ES

PUT _cluster/settings
{
  "persistent": {
    "logger.cluster.service": "DEBUG"}}Copy the code

10. Set the global level of slow logging

  1. For existing indexes, you can set the slow log thresholds with PUT _settings (a sketch follows after the template below);
  2. For newly created indexes, you can use a template similar to the one below:
PUT _template/global-slowlog_template
{
    "order": 1,"version": 0."template": "*"."settings": {
        "index.indexing.slowlog.threshold.index.debug" : "10ms"."index.indexing.slowlog.threshold.index.info" : "50ms"."index.indexing.slowlog.threshold.index.warn" : "100ms"."index.search.slowlog.threshold.fetch.debug" : "100ms"."index.search.slowlog.threshold.fetch.info" : "200ms"."index.search.slowlog.threshold.fetch.warn" : "500ms"."index.search.slowlog.threshold.query.debug" : "100ms"."index.search.slowlog.threshold.query.info" : "200ms"."index.search.slowlog.threshold.query.warn" : "1s"}}Copy the code

11. Why transport.tcp.port can be set to multiple ports

If transport.tcp.port is not set, it defaults to the range 9300-9399.

  • If a single port is configured and that port is already occupied, the node cannot start;
  • If a port range is configured and one port is occupied, the node tries the next port in the range until it finds an available one (see the elasticsearch.yml sketch below).
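A minimal elasticsearch.yml sketch contrasting the two configurations; the port numbers are only examples:

# a single fixed port: the node fails to start if 9300 is already in use
# transport.tcp.port: 9300

# a port range: the node falls back to the next free port in the range
transport.tcp.port: 9300-9399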

12. Set the shard delayed-allocation policy for a temporary ES restart

PUT _all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"}}Copy the code

III. Kibana

1. Custom annotations on Kibana charts

You can use TSVB to support annotations.

Using Kibana TSVB annotations: elasticsearch.cn/article/701

2. Export CSV files from Kibana Discover

See: How to quickly export the document table on the Kibana Discover page to CSV.

3. Modify the default home page of Kibana

elasticsearch.cn/article/633…

4. Selected community articles

  • Elasticsearch 6.6 Index Lifecycle Management
  • Data interaction between Hive and ElasticSearch
