replace a disk in a raid1 array on linux
From time to time, even the best drives go bad and need to be replaced. Here are my notes for doing this without screwing things up.
Raid1 means we have two drives; for the sake of this note (and because it's mostly true for my servers), these are /dev/sda and /dev/sdb.
Please: If you follow this note, make sure to double- or triple-check your device names and consider additional research before taking action on important infrastructure! This is not a tutorial, nor a “definitive guide to…” ☝️
In all cases: review configuration
The easiest way for me is always the listing in /proc/mdstat. It shows you which disk is assigned to which md device and also which one has gone bad.
cat /proc/mdstat

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdb2[1] sda2[0](F)
      999021888 blocks super 1.2 [2/1] [_U]
      bitmap: 7/8 pages [28KB], 65536KB chunk

md0 : active raid1 sdb1[1] sda1[0](F)
      1046528 blocks super 1.2 [2/1] [_U]

unused devices: <none>
In this case, if we follow the lines that start with md1 and md0, which are the raid devices in this machine, we can see that sdb2[1] and sda2[0] are the members of md1 (and sdb1[1] and sda1[0] the members of md0). The (F) marker and the status string [_U] at the end of the next line show us the missing/down raid member. The positions in [_U] follow the role numbers in the square brackets, not the order in which the devices are listed, so sda2 (role 0) is the defective member in md1 and sda1 is the defective member in md0.
💡 In the status string [_U], U depicts an “up” device and _ depicts a “down” or “missing” device.
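If the mdstat output leaves you unsure, mdadm itself can report the state per member. This is just a cross-check using the device names from the example above; look for a degraded array state and members flagged as faulty or removed.

mdadm --detail /dev/md0
mdadm --detail /dev/md1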
Another easy way would be to look up the devices with lsblk, but that only works if the device is still recognized by the OS, which it sometimes isn't.
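For reference, an lsblk call along these lines gives a quick overview; the column selection is just my preference, not something this note depends on:

lsblk -o NAME,SIZE,TYPE,MOUNTPOINT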
If /dev/sda has gone bad:
fail and remove the disk from the md device
I always do this, even if the disk I have to replace is no longer recognized by the OS. I've never had issues with this approach, though I can only offer anecdotal evidence that it helps. 🤷
Fail the disk we want to replace:
mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md1 --fail /dev/sda2
Now actually remove the disk you want to replace:
mdadm --manage /dev/md0 --remove /dev/sda1
mdadm --manage /dev/md1 --remove /dev/sda2
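Both steps can also be chained into a single mdadm call per array; a sketch using the same device names as above:

mdadm --manage /dev/md0 --fail /dev/sda1 --remove /dev/sda1
mdadm --manage /dev/md1 --fail /dev/sda2 --remove /dev/sda2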
You can then replace the faulty disk.
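To be sure you pull the right physical drive, it can help to note the serial numbers first, assuming smartmontools is installed. If the failed disk no longer responds, the serial of the healthy disk at least tells you which one not to touch.

# serial number of a disk (smartmontools)
smartctl -i /dev/sdb | grep -i serial
# or map device names to stable ids / serials
ls -l /dev/disk/by-id/ | grep sd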
Once the disk is replaced and the server is back up and running, continue with the next step:
Copy partition table from healthy disk to new, empty disk
Be careful with this next step! Take your time, check twice. If you mix up the device names, you end up nuking the partition table on the healthy disk. No bueno!
Copy the partition table from the healthy disk (here: /dev/sdb) to the new disk (here: /dev/sda):
sfdisk -d /dev/sdb | sfdisk /dev/sda
sfdisk -d dumps the partition table to stdout. The pipe hands it to the second sfdisk on stdin, which then writes the partition table to the new disk.
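Recent versions of sfdisk handle GPT as well as MBR. If you prefer gdisk's tooling for GPT disks, sgdisk can do the same job; note that its argument order (target first, source second) is easy to get backwards, and the new disk should get fresh GUIDs afterwards:

# copy the table from /dev/sdb onto /dev/sda, then randomize GUIDs on /dev/sda
sgdisk -R /dev/sda /dev/sdb
sgdisk -G /dev/sda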
Remove possibly existing superblocks
With this step we make sure that there are no md superblocks on the new disk that contain residual metadata from a previous array.
mdadm --zero-superblock /dev/sda1
mdadm --zero-superblock /dev/sda2
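To double-check, mdadm --examine should now complain that there is no md superblock on either partition:

mdadm --examine /dev/sda1
mdadm --examine /dev/sda2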
Add the new disk to the raid
mdadm -a /dev/md0 /dev/sda1
mdadm -a /dev/md1 /dev/sda2
The array should now start rebuilding. You can check the status with cat /proc/mdstat.
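A simple way to follow the rebuild is to refresh that output every few seconds; the interval is arbitrary:

watch -n 5 cat /proc/mdstat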
Install GRUB on all disks
You never know which disk will fail next, so it's a good idea to have the boot loader present on all of them, or booting might fail.
grub-install /dev/sda
grub-install /dev/sdb
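On Debian-based systems that use the grub-pc package, it can also be worth re-running the package configuration so future GRUB updates keep writing to both disks; treat this as an aside, your distribution may handle it differently:

dpkg-reconfigure grub-pc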
Done.
Some notes on NVMe
- NVMe drives are usually named /dev/nvme0n1, /dev/nvme1n1 etc.
- NVMe partitions are usually named /dev/nvme0n1p1, /dev/nvme0n1p2, where the p marks the partition
- Your OS might require a reboot to recognize the new partitions on a disk
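For illustration, the same steps as above with (hypothetical) NVMe device names would look like this:

mdadm --manage /dev/md0 --fail /dev/nvme0n1p1
mdadm --manage /dev/md0 --remove /dev/nvme0n1p1
sfdisk -d /dev/nvme1n1 | sfdisk /dev/nvme0n1
mdadm -a /dev/md0 /dev/nvme0n1p1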