Changes to Lucene are persisted to disk only during a Lucene commit, which is a relatively expensive operation and therefore cannot be performed after every index or delete operation. If the process exits or the hardware fails, Lucene discards from the index any changes made after one commit and before the next.

Because a Lucene commit is too expensive to perform on every individual change, each shard copy also writes operations to its transaction log, known as the translog. All index and delete operations are written to the translog after being processed by the internal Lucene index but before they are acknowledged. After a crash, recent operations that have been acknowledged but were not included in the last Lucene commit are recovered from the translog when the shard recovers.
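The write path and crash recovery described above can be illustrated with a toy model. This is only a sketch: the class ToyShard and its methods are hypothetical names invented for illustration, and real Lucene and Elasticsearch internals are far more involved.

```python
class ToyShard:
    """Toy model of one shard copy; not Elasticsearch's actual implementation."""

    def __init__(self):
        self.committed = {}  # state as of the last Lucene commit (on disk)
        self.in_memory = {}  # changes made since that commit (lost on a crash)
        self.translog = []   # durable log of operations since that commit

    def index(self, doc_id, doc):
        self.in_memory[doc_id] = doc                  # 1. apply to the Lucene index
        self.translog.append(("index", doc_id, doc))  # 2. append to the translog
        return "acknowledged"                         # 3. only then acknowledge

    def delete(self, doc_id):
        self.in_memory[doc_id] = None                 # None marks a deletion
        self.translog.append(("delete", doc_id, None))
        return "acknowledged"

    def crash(self):
        self.in_memory = {}  # uncommitted Lucene changes are gone

    def recover(self):
        # Replay every logged operation on top of the last commit.
        for op, doc_id, doc in self.translog:
            self.in_memory[doc_id] = doc if op == "index" else None

    def visible(self):
        merged = {**self.committed, **self.in_memory}
        return {k: v for k, v in merged.items() if v is not None}


shard = ToyShard()
shard.index("1", {"title": "a"})
shard.index("2", {"title": "b"})
shard.delete("1")
shard.crash()
shard.recover()
print(shard.visible())  # {'2': {'title': 'b'}}
```

All three acknowledged operations survive the crash because they were logged before being acknowledged, even though none of them had reached a commit.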

An Elasticsearch flush is the process of performing a Lucene commit and starting a new translog generation. Flushes are performed automatically in the background to ensure that the translog does not grow too large, which would make replaying its operations during recovery take a long time. A flush can also be triggered manually through an API, although this is rarely needed.
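For reference, a manual flush is a POST to the _flush endpoint of an index. The sketch below only builds the HTTP request with the Python standard library; the address http://localhost:9200 and the index name my-index are assumptions chosen for illustration, and the helper build_flush_request is a hypothetical name.

```python
import urllib.request


def build_flush_request(base_url, index):
    """Build (but do not send) a request to the flush API for one index."""
    return urllib.request.Request(f"{base_url}/{index}/_flush", method="POST")


req = build_flush_request("http://localhost:9200", "my-index")
print(req.get_method(), req.full_url)  # POST http://localhost:9200/my-index/_flush

# urllib.request.urlopen(req) would send it; omitted so the sketch runs
# without a cluster.
```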

Translog settings

The data in the translog is persisted to disk only when the translog is fsynced and committed. In the event of a hardware failure, an operating system crash, a JVM crash, or a shard failure, any data written since the previous translog commit is lost.

By default, index.translog.durability is set to request, which means that Elasticsearch reports the success of an index, delete, update, or bulk request to the client only after the translog has been successfully fsynced and committed on the primary and on every allocated replica. If index.translog.durability is set to async, Elasticsearch fsyncs and commits the translog only once every index.translog.sync_interval, which means that any operations performed just before a crash may be lost when the node recovers.
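The difference between the two modes can be sketched with a toy model of a single translog. ToyTranslog and its methods are hypothetical names, and the periodic background sync of async mode is simulated by calling sync() by hand.

```python
class ToyTranslog:
    """Toy model of translog durability; not Elasticsearch's implementation."""

    def __init__(self, durability="request"):
        self.durability = durability
        self.buffered = []  # written but not yet fsynced
        self.synced = []    # fsynced and committed, survives a crash

    def write(self, op):
        self.buffered.append(op)
        if self.durability == "request":
            self.sync()  # fsync before acknowledging
        return "acknowledged"

    def sync(self):
        self.synced.extend(self.buffered)
        self.buffered.clear()

    def crash(self):
        self.buffered.clear()     # anything not fsynced is lost
        return list(self.synced)  # what recovery can replay


safe = ToyTranslog("request")
safe.write("op1")
safe.write("op2")
print(safe.crash())  # ['op1', 'op2']

fast = ToyTranslog("async")
fast.write("op1")
fast.sync()          # stands in for the sync_interval timer firing
fast.write("op2")    # acknowledged, but not yet fsynced
print(fast.crash())  # ['op1']
```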

The following dynamically updatable per-index settings control the behavior of the translog:

index.translog.sync_interval: How often the translog is fsynced to disk and committed, regardless of write operations. The default is 5s. Values less than 100ms are not allowed.

index.translog.durability: Whether to fsync and commit the translog after every index, delete, update, or bulk request. This setting accepts the following values:

  • request: (default) fsync and commit after every request. In the event of a hardware failure, all acknowledged writes will already have been committed to disk.
  • async: fsync and commit in the background every sync_interval. In the event of a failure, all acknowledged writes since the last automatic commit are discarded.
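As a rough illustration, the two settings above could be assembled into a settings body as follows. The helper names to_millis and translog_settings are hypothetical, and the time parser handles only the ms, s, and m units as a simplification.

```python
import json
import re

_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000}


def to_millis(value):
    """Parse a simple time value such as '5s' or '200ms' (subset of units)."""
    match = re.fullmatch(r"(\d+)(ms|s|m)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)]


def translog_settings(durability="request", sync_interval="5s"):
    """Build a settings body for the translog options described above."""
    if durability not in ("request", "async"):
        raise ValueError("durability must be 'request' or 'async'")
    if to_millis(sync_interval) < 100:
        raise ValueError("sync_interval below 100ms is not allowed")
    return {
        "index": {
            "translog": {
                "durability": durability,
                "sync_interval": sync_interval,
            }
        }
    }


body = translog_settings("async", "10s")
print(json.dumps(body))
```

Such a body would typically be sent to the update index settings API (PUT on the _settings endpoint of the index); sending is left out so the sketch runs without a cluster.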

index.translog.flush_threshold_size: The translog stores all operations that are not yet safely persisted in Lucene (that is, not part of a Lucene commit point). Although these operations are available for reads, they need to be replayed if the shard is stopped and has to be recovered. This setting controls the maximum total size of these operations, to prevent recoveries from taking too long. Once the maximum size is reached, a flush is performed, generating a new Lucene commit point. The default is 512mb.
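The size-triggered flush can be pictured with a small toy model. ToyFlushTracker is a hypothetical name, the exact comparison Elasticsearch uses is an assumption here, and 512mb is taken as 512 * 1024 * 1024 bytes.

```python
DEFAULT_FLUSH_THRESHOLD_BYTES = 512 * 1024 * 1024  # 512mb


class ToyFlushTracker:
    """Toy model: flush once uncommitted translog operations exceed a threshold."""

    def __init__(self, threshold=DEFAULT_FLUSH_THRESHOLD_BYTES):
        self.threshold = threshold
        self.uncommitted_bytes = 0
        self.flushes = 0

    def add(self, op_size):
        self.uncommitted_bytes += op_size
        if self.uncommitted_bytes > self.threshold:
            self.flush()

    def flush(self):
        # A flush creates a new Lucene commit point, so nothing logged
        # before it needs to be replayed any more.
        self.flushes += 1
        self.uncommitted_bytes = 0


tracker = ToyFlushTracker(threshold=100)
for size in (40, 40, 40):
    tracker.add(size)
print(tracker.flushes, tracker.uncommitted_bytes)  # 1 0
```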

See the website: www.elastic.co/guide/en/el…

This translation may be inaccurate, and corrections are welcome. Translating takes effort, so please do not copy it without permission; if you reuse it, please credit the source.