What is a Kafka segment

To make partition data easier to manage, Kafka breaks each partition's log into multiple data chunks of bounded size. Each of these chunks is called a segment.

Segment files share the same maximum size, but the number of messages in each is not necessarily the same. This design makes it easy to clear out consumed (expired) messages and improves disk utilization.

Composition of a segment

A segment corresponds to multiple files on disk:

  • *.index: the offset index file for the messages
  • *.timeindex: the timestamp index file for the messages (added in version 0.10.1)
  • *.log: the message data

There may be more:

  • *.snapshot: records snapshots of producer state, used for idempotence and transactions. (todo)
  • *.swap: used for segment recovery. (todo)
  • *.txnindex: records aborted transaction information. (todo)

Here we focus on the first three.

File naming

A segment's file name is the offset of the first message in the segment (its base offset), left-padded with zeros to 20 digits.

The first segment's base offset is 0. Each subsequent segment file is named after the offset of its first message, which is one greater than the offset of the last message in the preceding segment file.

Offsets are 64-bit integers, so the largest possible offset is Long.MAX_VALUE = 2^63 - 1. Since Kafka limits the size of each segment, file names never come close to this value.
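The naming rule above can be sketched in a few lines; `segment_file_names` is a hypothetical helper for illustration, not part of Kafka:

```python
def segment_file_names(base_offset: int) -> list[str]:
    """Derive a segment's file names from its base offset:
    the offset of the segment's first message, zero-padded to 20 digits."""
    stem = f"{base_offset:020d}"
    return [stem + ext for ext in (".log", ".index", ".timeindex")]

print(segment_file_names(0))     # first segment of a partition
print(segment_file_names(1355))  # a later segment whose first message has offset 1355
```

Running this prints `00000000000000000000.log` for the first segment and `00000000000000001355.log` for the later one, matching the naming convention described above.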

Kafka's sparse index

Adding an index to data is a common way to speed up queries, and Kafka is no exception, but Kafka uses a sparse index.

An ordinary index, such as one in a relational database, stores an entry for every record in the data set. A sparse index is called sparse because it does not create an entry for every record: Kafka adds one index entry roughly every log.index.interval.bytes (4 KB by default) of appended message data.

The biggest advantage of a sparse index is that it saves disk space. The cost is that lookups are slower than with an ordinary index: the index only narrows the search to a rough range, and finding the specific record requires scanning that range. In effect, it trades time for space.

Because Kafka is designed around sequential disk reads and writes, scanning a small range has little impact on speed, while the sparse index saves a great deal of disk space.

Kafka's use of sparse indexes

Kafka creates two sparse indexes for the message data: a .index file for lookups by offset, and a .timeindex file for lookups by timestamp.

Contents of each segment file

You can view the contents of files such as .log using the tools shipped with Kafka:

$KAFKA_HOME/bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files ./xxx.log

Opening a .index file, you will see something similar to the following:

offset: 53 position: 4124
offset: 106 position: 8264
...
offset: 1302 position: 103050
offset: 1354 position: 107210
  • offset: the offset of the message
  • position: the physical byte position of the message in the .log file

The .timeindex file content is similar to the following:

timestamp: 1547557706588 offset: 53 
timestamp: 1547557707588 offset: 106
...
timestamp: 1547557716601 offset: 1302
timestamp: 1547557717179 offset: 1354
  • timestamp: the timestamp of the message
  • offset: the offset of the message

The .log file content is similar to the following:

offset: 1301 position: 102970 CreateTime: 1547557716588 payload: message_1301
offset: 1302 position: 103050 CreateTime: 1547557716601 payload: message_1302
offset: 1303 position: 103130 CreateTime: 1547557716612 payload: message_1303
offset: 1304 position: 103210 CreateTime: 1547557716624 payload: message_1304
...
offset: 1353 position: 107130 CreateTime: 1547557717167 payload: message_1353
offset: 1354 position: 107210 CreateTime: 1547557717179 payload: message_1354
offset: 1355 position: 107290 CreateTime: 1547557717183 payload: message_1355

The exact fields may vary from version to version, but the core idea remains the same.

Sparse index lookup process

Take the .index file as an example, and suppose we want to find the message with offset 1365.

First, since we are searching by offset, we locate the partition that holds offset 1365. Within that partition, we look at the segment file names, which are base offsets, and pick the segment whose base offset is the largest one not exceeding 1365. We then binary-search that segment's .index file for the largest indexed offset less than or equal to 1365; in the sample above, that is offset 1354 at position 107210. Finally, we read the .log file sequentially from that position until we reach the message with offset 1365.

A .timeindex lookup works the same way, except that its result is an offset, which is then resolved through the .index file as above.
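The second step of the lookup can be sketched with a binary search over the sample .index entries shown earlier. This is an illustrative model (in reality Kafka binary-searches the memory-mapped index file directly); `find_start_position` is a made-up name:

```python
import bisect

# (offset, position) pairs parsed from the sample .index contents above
index = [(53, 4124), (106, 8264), (1302, 103050), (1354, 107210)]

def find_start_position(target_offset: int) -> int:
    """Binary-search for the largest indexed offset <= target_offset and
    return the .log byte position where the sequential scan should begin."""
    i = bisect.bisect_right([o for o, _ in index], target_offset) - 1
    if i < 0:
        return 0  # target precedes the first index entry: scan from the start
    return index[i][1]

print(find_start_position(1365))  # scan begins at position 107210 (entry for offset 1354)
```

From the returned position, the broker scans the .log file forward until it hits offset 1365, which is why the range between index entries is kept small.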

Cleaning up segments

Kafka runs a dedicated log-deletion task that periodically detects and removes segment files that no longer meet the retention requirements. The check interval is set by the broker parameter log.retention.check.interval.ms, which defaults to 300000 milliseconds, i.e. 5 minutes.

There are three retention policies for log segments:

  • Time-based retention policies
  • Retention policies based on log size
  • Retention policy based on log start offset
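The time-based policy can be sketched as follows. This is a simplified model with hypothetical names (`expired_segments`, the tuple layout), not Kafka's implementation: a segment is eligible for deletion when its newest message is older than the retention window.

```python
RETENTION_MS = 7 * 24 * 60 * 60 * 1000  # e.g. log.retention.hours=168

def expired_segments(segments, now_ms, retention_ms=RETENTION_MS):
    """segments: list of (base_offset, largest_timestamp_ms) per segment.
    Returns the base offsets of segments whose newest message has
    fallen outside the retention window."""
    return [base for base, largest_ts in segments
            if now_ms - largest_ts > retention_ms]

now = 1_700_000_000_000
segments = [
    (0,    now - RETENTION_MS - 60_000),  # newest message is past retention
    (1355, now - 1_000),                  # still within the window
]
print(expired_segments(segments, now))  # [0]
```

The size-based and start-offset-based policies follow the same pattern, comparing the accumulated log size or the segment's offset range against the configured limits instead of a timestamp.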

Segment configuration

The following shows the segment-related configuration in server.properties:

# Maximum size of a single segment file
log.segment.bytes=107370
# Retention duration (default: 7 days)
log.retention.hours=168
# Maximum amount of log data to retain
log.retention.bytes=1073741824

Why Kafka reads fast

Kafka achieves near-constant-time reads through the following five layers:

Topic -> partition -> segment -> sparse index -> sequential read/write