The beginning of the story

  1. A disk alarm fired on one node of a Kafka cluster. I immediately checked the retention time of the topic carrying the most data, only to find that it had already been set to the minimum.
  2. After talking with the R&D colleagues, it turned out that new business traffic had come in, and they hoped to solve the problem by expanding the node disks. Okay, let me see how to expand them…
  3. Unfortunately, I found that the data directory on this node actually sits on the system disk, and expanding it would mean restarting the server. Come on, I had only just started the day and was afraid something might go wrong. Is there a more elegant way?
  4. Checking the monitoring, I found that several other nodes in the cluster still had plenty of free disk space. Doesn't the distribution of this topic look a little uneven? Could I migrate one of its partition replicas to an idle node? That should solve the problem, right?

Trying to solve it

Let’s start with some background information

  1. Cluster and involved topic:
  • Service version: 0.10.2
  • Number of nodes: 5
    • broker-0
    • broker-1
    • broker-2
    • broker-3
    • broker-4
  • Topic name: message
  • Number of partitions: 3
  • Replication factor: 2
  2. Take a look at the storage of broker-1 at the time of the problem, approaching 100% (a sketch for spotting which partition directories take up the space follows this list):
[root@broker-1 bin]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       197G  181G  7.9G  96% /
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G   24K  7.8G   1% /dev/shm
tmpfs           7.8G  520K  7.8G   1% /run
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
tmpfs           1.6G     0  1.6G   0% /run/user/10001
  3. The broker-3 node has no partition replicas of this topic assigned to it, while broker-1 holds two:
Topic:message   PartitionCount:3   ReplicationFactor:2   Configs:retention.ms=172800000,max.message.bytes=5242880
    Topic: message   Partition: 0   Leader: 4   Replicas: 4,0   Isr: 4,0
    Topic: message   Partition: 1   Leader: 0   Replicas: 0,1   Isr: 0,1
    Topic: message   Partition: 2   Leader: 1   Replicas: 1,2   Isr: 1,2
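As an aside, a quick way to confirm which partition directories are actually occupying the space is to size the directories under the Kafka data directory. This is only a sketch: the /kafka_log path is borrowed from the broker-3 prompt shown later and should be replaced with whatever log.dirs points to on your brokers.

# list partition directories under the Kafka data directory, largest first (path is an assumption)
du -sh /kafka_log/* | sort -rh | head -20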

Searching for solutions

  1. A search for "kafka partition replica migration" turned up quite a few articles on the web.
  2. After reading a few of them, I concluded that Kafka can perform the migration manually through the kafka-reassign-partitions.sh script.
  3. However, before running the script, you need to prepare a JSON file that specifies the topic to be migrated, its partitions, and the target broker IDs for each replica.
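As a side note, instead of hand-writing the file you can also let the script propose an assignment with its --generate mode. This is just a sketch of that option, not what I used here; topics-to-move.json and the broker list are illustrative.

# topics-to-move.json contains only the topics to rebalance, e.g. {"topics":[{"topic":"message"}],"version":1}
./kafka-reassign-partitions.sh --zookeeper 127.0.0.1:2181 \
  --topics-to-move-json-file topics-to-move.json \
  --broker-list "0,1,2,3,4" --generate

The --generate output prints both the current assignment and a proposed one, which you can edit and then feed back in with --execute.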

Before touching production, I timidly ran through the whole procedure on a test machine first, and only dared to operate online once I was sure nothing would go wrong 😅

Start operation

  1. First, write the configuration file. I want to move the replicas of partition 1 from nodes 0 and 1 to nodes 0 and 3, so the assignments of the other two partitions stay unchanged.
$ vim partitions-topic.json
{
  "partitions": [
    {"topic": "message", "partition": 0, "replicas": [4,0]},
    {"topic": "message", "partition": 1, "replicas": [0,3]},
    {"topic": "message", "partition": 2, "replicas": [1,2]}
  ],
  "version": 1
}

Note: you only need to list the partitions you want to migrate; partitions not listed in the file are left untouched.

  2. Run the migration script to start the migration according to this file.
[root@broker-1 bin]# ./kafka-reassign-partitions.sh --zookeeper 127.0.0.1:2181 --reassignment-json-file partitions-topic.json --execute
Current partition replica assignment

{"version":1,"partitions":[{"topic":"message","partition":0,"replicas":[4,0]},{"topic":"message","partition":2,"replicas":[1,2]},{"topic":"message","partition":1,"replicas":[0,1]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.
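Note the "Current partition replica assignment" JSON printed above: as the output itself says, it is worth saving, because it is exactly what you would feed back to the script to roll back. A sketch of a rollback, assuming that JSON was saved to a file named rollback.json:

# only if something goes wrong: re-apply the previous assignment that --execute printed
./kafka-reassign-partitions.sh --zookeeper 127.0.0.1:2181 \
  --reassignment-json-file rollback.json --execute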
  3. Check the migration progress.
[root@broker-1 bin]# ./kafka-reassign-partitions.sh --zookeeper 127.0.0.1:2181 --reassignment-json-file partitions-topic.json --verify
Status of partition reassignment:
Reassignment of partition [message,0] completed successfully
Reassignment of partition [message,1] is still in progress
Reassignment of partition [message,2] completed successfully

As you can see from the output, only partition 1 is actually being migrated; the other two are reported as completed because their assignments were not changed.
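If you would rather not keep re-running --verify by hand, a small loop can poll it until nothing is reported as in progress. A rough sketch; the 30-second interval is arbitrary.

# poll --verify until no partition is still "in progress"
while ./kafka-reassign-partitions.sh --zookeeper 127.0.0.1:2181 \
      --reassignment-json-file partitions-topic.json --verify | grep -q "in progress"; do
  sleep 30
done
echo "reassignment finished"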

Verify migration

  1. Check whether the corresponding data directory has been created on the broker-3 node.
[root@broker-3 kafka_log]# ll message-1 -d
drwxr-xr-x 2 root root 4096 Jun  2 15:35 message-1

It has indeed been created, and data is steadily being written into it.
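Listing the contents of the new partition directory shows the segment files being created and growing as the replica catches up with the leader. Again just a sketch, with the same /kafka_log path assumption as before.

# run on broker-3: the .log/.index segment files grow as the new replica syncs
ls -lh /kafka_log/message-1/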

  2. Wait a while, then check the migration status again.
[root@broker-1 bin]# ./kafka-reassign-partitions.sh --zookeeper 127.0.0.1:2181 --reassignment-json-file partitions-topic.json --verify
Status of partition reassignment:
Reassignment of partition [message,0] completed successfully
Reassignment of partition [message,1] completed successfully
Reassignment of partition [message,2] completed successfully

Partition 1 now also reports that the migration is complete.

  3. At this point, describing the topic with the script shows that the replicas of partition 1 have indeed been migrated to nodes 0 and 3.
./kafka-topics.sh --zookeeper 127.0.0.1:2181 --describe --topic message
Topic:message   PartitionCount:3   ReplicationFactor:2   Configs:retention.ms=172800000,max.message.bytes=5242880
    Topic: message   Partition: 0   Leader: 4   Replicas: 4,0   Isr: 4,0
    Topic: message   Partition: 1   Leader: 0   Replicas: 0,3   Isr: 0,3
    Topic: message   Partition: 2   Leader: 1   Replicas: 1,2   Isr: 1,2
  4. Check the disk usage of the broker-1 node again. Space has indeed been freed. Problem solved! (Sigh of relief 😂)
[root@broker-1 bin]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       197G  136G   54G  72% /
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G   24K  7.8G   1% /dev/shm
tmpfs           7.8G  520K  7.8G   1% /run
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
tmpfs           1.6G     0  1.6G   0% /run/user/10001

Lessons learned

  1. When deploying a new service, mount dedicated data disks on the server, so you are not stuck later with a painful capacity expansion (see the sketch after this list).
  2. While the service is running normally, think ahead about what could go wrong and rehearse the fixes in advance, instead of cramming at the last minute and scaring yourself.
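For the first lesson, the Kafka-side knob is log.dirs in server.properties: pointing it at a mounted data disk keeps broker data off the system disk, so storage can be grown or swapped without touching the root filesystem. A minimal sketch, assuming the data disk is mounted at /data:

# config/server.properties — keep Kafka data on a dedicated data disk (mount point is an assumption)
log.dirs=/data/kafka-logs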