Original address: https://www.tony-yin.site/201…

In theory, a perfect system should have no single point of failure, yet the system disk is often ignored, and it is precisely the most important point of all. This article explains how to use soft RAID to achieve high availability for the system disk, covering automatic disk replacement, automatic alerting, and automatic recovery.

System disk composition

All mount points use RAID1 to ensure data redundancy. Even if one of the disks is damaged, the operating system keeps running normally; you only need to replace the damaged disk with a new one and resynchronize the data.

Mount Point   RAID    Size
/             RAID1   100 GB
/boot         RAID1   512 MB
/boot/efi     RAID1   200 MB
swap          RAID1   50 GB
/var/log      RAID1   50 GB
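For reference, here is a minimal sketch of how RAID1 arrays matching this layout could be created by hand with mdadm. The installer does this for you during setup; the array names and partition numbers are assumptions based on the lsblk layout shown later in this post.

# Sketch only: create a RAID1 array for / from the two disks' first partitions.
mdadm --create /dev/md125 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
# The ESP array uses 1.0 metadata (superblock at the end of the device),
# so the firmware can still read the partition as a plain FAT filesystem.
mdadm --create /dev/md123 --level=1 --metadata=1.0 --raid-devices=2 /dev/sda4 /dev/sdb4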

System disk soft RAID configuration

On the boot page, select the UEFI installation method, because the traditional BIOS method is limited in both capacity and partitioning. For details, please read [talk about BIOS, UEFI, MBR, GPT, GRUB…].


UEFI has an ESP (EFI System Partition), that is, the /boot/efi partition; its RAID level is also set to RAID1.


The rest of the mount points should also be selected as RAID1.

[root@ ~]# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda         8:0    0 447.1G  0 disk
├─sda1      8:1    0 100.1G  0 part
│ └─md125   9:125  0   100G  0 raid1 /
├─sda2      8:2    0  50.1G  0 part
│ └─md127   9:127  0  50.1G  0 raid1 [SWAP]
├─sda3      8:3    0   513M  0 part
│ └─md126   9:126  0 512.4M  0 raid1 /boot
├─sda4      8:4    0   201M  0 part
│ └─md123   9:123  0   201M  0 raid1 /boot/efi
└─sda5      8:5    0    50G  0 part
  └─md124   9:124  0    50G  0 raid1 /var/log
sdb         8:16   0 447.1G  0 disk
├─sdb1      8:17   0 100.1G  0 part
│ └─md125   9:125  0   100G  0 raid1 /
├─sdb2      8:18   0  50.1G  0 part
│ └─md127   9:127  0  50.1G  0 raid1 [SWAP]
├─sdb3      8:19   0   513M  0 part
│ └─md126   9:126  0 512.4M  0 raid1 /boot
├─sdb4      8:20   0   201M  0 part
│ └─md123   9:123  0   201M  0 raid1 /boot/efi
└─sdb5      8:21   0    50G  0 part
  └─md124   9:124  0    50G  0 raid1 /var/log

Disk replacement process

Take sda and sdb as the system disks, with sdb as the disk being replaced, for example.

Unplug & insert the disk

Because all mount points are on RAID1, a redundant soft RAID array can tolerate losing one of its disks; the arrays are identified by their metadata rather than by device order, so there is no out-of-order disk problem. You can therefore hot-swap the disk directly.
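Before pulling the old disk, you can optionally mark its members as failed and remove them from each array so that md releases them cleanly. A minimal sketch (not a step from the original post; the array-to-partition mapping follows the layout shown above):

# Fail and remove the old disk's member from each array before unplugging it.
mdadm /dev/md125 --fail /dev/sdb1 --remove /dev/sdb1
mdadm /dev/md127 --fail /dev/sdb2 --remove /dev/sdb2
mdadm /dev/md126 --fail /dev/sdb3 --remove /dev/sdb3
mdadm /dev/md123 --fail /dev/sdb4 --remove /dev/sdb4
mdadm /dev/md124 --fail /dev/sdb5 --remove /dev/sdb5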

Clone the disk partition table

The newly inserted sdb should, in theory, be unpartitioned, so the partition table on sda needs to be cloned to it in full. For disks with GPT partition tables, use the parted or sgdisk tools.

sgdisk -R /dev/sdb /dev/sda
sleep 5

It is better to sleep for a few seconds here, because the underlying synchronization is not finished immediately after the clone.

Generate a new GUID

After the partition information is cloned, generate a new GUID for sdb; otherwise sdb will have the same GUID as sda, since the partition table was copied verbatim.

sgdisk -G /dev/sdb
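To double-check both steps, you can print each disk's GPT header: the partition layouts should match, while the disk GUIDs must now differ (a quick sanity check, not part of the original procedure):

# Layouts should be identical; the 'Disk identifier (GUID)' lines should differ.
sgdisk -p /dev/sda | grep 'Disk identifier'
sgdisk -p /dev/sdb | grep 'Disk identifier'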

Make the kernel reload the partition table

partprobe /dev/sdb

Copy the bootloader

Note:

This step is critical, arguably the most critical and most easily overlooked step in the whole process. Soft RAID's data redundancy does not extend to the operating system's bootloader: RAID1 protects neither the MBR under legacy BIOS nor the ESP partition under UEFI. By "not redundant" I mean that soft RAID does not replicate this data for you; it needs to be made redundant separately.

If you boot with a traditional BIOS, you need to copy the MBR, the first 512 bytes of the hard disk.

[root@ ~]# dd if=/dev/sda of=/dev/sdb bs=512 count=1 
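An alternative to copying raw sectors (not the author's method, but worth knowing) is to let GRUB write the boot code itself; on CentOS 7 that would be:

# Reinstall the GRUB2 boot code onto the replacement disk (BIOS mode only).
grub2-install /dev/sdb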

Note that we are using UEFI boot mode, which is completely different from BIOS, so copying the first 512 bytes of the disk has no effect. The UEFI bootloader lives in the ESP, so the entire ESP partition (the fourth partition in this layout) needs to be copied.

[root@ ~]# dd if=/dev/sda4 of=/dev/sdb4

Did you think this was the end?

For UEFI boot, copying the ESP partition alone is not enough; you also need to add the new system disk to the boot entries. When a disk is unplugged and replaced, its original boot entry is removed, so a boot entry must be created for the new disk.

[root@ ~]# efibootmgr -c -g -d /dev/sdb -p 4 -L "Centos #2" -l '\EFI\centos\grubx64.efi'
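You can verify the result with efibootmgr's listing mode, and delete a stale entry left over from the old disk if one remains (entry number 0003 below is just a placeholder):

[root@ ~]# efibootmgr -v          # list all boot entries with their loader paths
[root@ ~]# efibootmgr -b 0003 -B  # delete stale boot entry 0003, if any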

To learn more about efibootmgr, read:

  • Efibootmgr wiki
  • Manage UEFI boot entries with EFIBOOTMGR and add missing boot entries

Data synchronization

Add the replacement disk's partitions to the RAID1 arrays so that the data on sda is synchronized to sdb. Once synchronization completes, all arrays are fully redundant again.

[root@ ~]# mdadm /dev/md123 -a /dev/sdb4
[root@ ~]# mdadm /dev/md124 -a /dev/sdb5
[root@ ~]# mdadm /dev/md125 -a /dev/sdb1
[root@ ~]# mdadm /dev/md126 -a /dev/sdb3
[root@ ~]# mdadm /dev/md127 -a /dev/sdb2
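To confirm that each new member was accepted, you can inspect an array's detail; during resync the new partition should be listed as "spare rebuilding" (an optional check, not in the original steps):

[root@ ~]# mdadm -D /dev/md125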

Synchronize the configuration file

Each time the soft RAID setup is modified, update the configuration file promptly, so that you can inspect the RAID configuration or reassemble the arrays from it later.

[root@ ~]# mdadm -Ds > /etc/mdadm.conf
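The resulting file contains one ARRAY line per device; the UUIDs below are made-up placeholders for illustration:

# Illustrative /etc/mdadm.conf content (UUIDs are placeholders):
ARRAY /dev/md123 metadata=1.0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md125 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx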

Get the progress value

If data is being synchronized, the word recovery appears. [1/2] means that only one of the two member devices is up to date, and [_U] means the first device is not active while the second one is. Note that the progress value on the recovery line is not the overall RAID synchronization progress, but only that of the current array. The overall progress across all arrays can be calculated as finished blocks / total blocks.

[root@ ~]# cat /proc/mdstat
Personalities : [raid0] [raid1]
md123 : active raid1 sdb4[1] sda4[0]
      205760 blocks super 1.0 [1/2] [_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md124 : active raid1 sdb5[1] sda5[0]
      52428800 blocks super 1.2 [1/2] [_U]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md125 : active raid1 sdb1[1] sda1[0]
      83886080 blocks super 1.2 [1/2] [_U]
      [=======>.............]  recovery = 35.6% (29863444/83886080) finish=9.6min speed=93472K/sec
      bitmap: 1/1 pages [4KB], 65536KB chunk

md126 : active raid1 sdb3[1] sda3[0]
      524736 blocks super 1.2 [1/2] [_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 sdb2[1] sda2[0]
      52428800 blocks super 1.2 [1/2] [_U]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
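Following the note above about finished blocks / total blocks, here is a minimal sketch that sums the per-array recovery figures into one overall percentage (it only counts arrays that are still recovering; fully synced arrays are ignored):

# Sum (synced blocks / total blocks) over every recovery line in /proc/mdstat.
awk -F'[(/)]' '/recovery/ { done += $2; total += $3 }
     END { if (total) printf "overall: %.1f%%\n", done * 100 / total }' /proc/mdstat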

Monitoring & alerting

To maintain the system disk well, monitoring and alerting are essential.

Disk health alert

You can use the smartctl tool to obtain disk health status. The health information returned differs between disk types, but a healthy disk generally returns PASSED or OK.

[root@ ~]# smartctl -H /dev/sda
smartctl 6.2 2017-02-27 r4394 [x86_64-linux-4.14.78-2011.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
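A periodic check from cron can turn this into an alert. A minimal sketch, assuming a placeholder alert hook; the grep covers both the ATA "PASSED" and SCSI "OK" wordings:

#!/bin/bash
# Check each system disk's SMART health and raise an alert on failure.
for dev in /dev/sda /dev/sdb; do
    status=$(smartctl -H "$dev" | grep -E 'self-assessment|Health Status')
    if ! echo "$status" | grep -qE 'PASSED|OK'; then
        logger -t disk-health "SMART check failed on $dev: $status"
        # alert_api "$dev" "$status"   # placeholder: call your own alarm interface
    fi
done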

Disk plug/unplug alert

Using the udev mechanism, write a rules file covering the add and remove actions to listen for disk insertion and removal events, then call the alert interface.

[root@ ~]# cat /etc/udev/rules.d/50-ssd-monitor.rules
KERNEL=="sd[a-z]+$", ACTION=="remove", SUBSYSTEM=="block", RUN+="/usr/bin/python /usr/lib/python2.7/site-packages/disk_watcher/os_disk.py %k pullout"
KERNEL=="sd[a-z]+$", ACTION=="add", SUBSYSTEM=="block", RUN+="/usr/bin/python /usr/lib/python2.7/site-packages/disk_watcher/os_disk.py %k insert"
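After editing the rules file, reload the udev rules; you can then watch kernel block events live to confirm that pulls and inserts are seen:

# Reload udev rules and monitor block add/remove events for testing.
udevadm control --reload-rules
udevadm monitor --kernel --subsystem-match=block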

Conclusion

This article introduced how to use soft RAID to make the system disk highly available, described in detail how to replace the disk and resynchronize data when one of the system disks is damaged, and also explained monitoring and alerting. Overall, the whole pipeline is basically covered; the finer details require more hands-on practice.

References

  1. How To Copy a GPT Partition Table to Another Disk using sgdisk
  2. SUSE’s soft RAID after a hard disk failure repair method
  3. How to install Ubuntu 14.04/16.04 64-bit with a dual-boot RAID 1 partition on an UEFI/GPT system?
  4. mdadm: device or resource busy
  5. Linux hard disk character allocation
  6. UEFI via software RAID with mdadm in Ubuntu 16.04
  7. What’s the difference between creating mdadm array using partitions or the whole disks directly
  8. XenServer 6.2 with Software RAID
  9. UEFI boot fails when cloning image to new machine
  10. How to correctly install GRUB on a soft RAID 1?
  11. How to boot after RAID failure (software RAID)?
  12. mdadm raid 1 grub only on sda
  13. Can the EFI system partition be RAIDed?
  14. Partitioning EFI machine with two SSD disks in mirror