Replace a disk in RAID

Exchanging hard disks in a Software-RAID - Hetzner Docs

Let’s assume that the defective drive is /dev/nvme1n1.

What is what?

  1. RAID arrays (md devices): /dev/md0, /dev/md1, /dev/md2
  2. Physical disks: /dev/nvme0n1, /dev/nvme1n1 or /dev/sda, /dev/sdb
  3. Partitions on physical disks: /dev/nvme0n1p1, /dev/nvme0n1p2 or /dev/sda1, /dev/sda2

Examine current state

# Find the hardware RAID controller and its model, if any
lspci | grep RAID

# List devices
ls -1 /dev/nvme*

# Check the RAID configuration
cat /proc/mdstat

# List all partitions on all drives
cat /proc/partitions

# List the partition tables of all disks, including the RAID member partitions
fdisk -l
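To spot a degraded array quickly, the status brackets in /proc/mdstat can be scanned for a missing member: [UU] means both members are present, [U_] means one is gone. The following is a minimal sketch, not part of the original procedure; degraded_arrays is a hypothetical helper name.

```shell
# Print the names of md arrays that have a missing member.
# Reads text in /proc/mdstat format from stdin.
# degraded_arrays is a hypothetical helper name, not an mdadm command.
degraded_arrays() {
    awk '
        # Remember which array the following lines describe
        /^md[0-9]+ :/ { array = $1 }
        # A status such as [U_] or [_U] contains "_" for each missing member
        /\[[U_]*_[U_]*\]/ { if (array != "") print array }
    '
}

# On a live system, read the real file if it exists
if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
```

An empty result means every array reports all of its members as up.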

Notes on NVMe drives

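Non-Volatile Memory Express (NVMe) is a storage interface for SSDs attached via PCIe; its first specification was published in 2011, and the first drives reached the market around 2013.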

nvme0 vs nvme0n1

Naming scheme:

/dev/nvme<CONTROLLER_NUMBER>n<NAMESPACE>p<PARTITION>

NVMe has the concept of namespaces. The character device /dev/nvme0 is the NVMe controller, while block devices such as /dev/nvme0n1 are the NVMe storage namespaces: these are the devices you use for actual storage, and they behave essentially like disks.
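The naming scheme can be illustrated with a small helper that splits a device path into its parts. This is only a sketch under the scheme described above; parse_nvme_name is a hypothetical function, not part of nvme-cli.

```shell
# Split an NVMe device path into controller, namespace and partition numbers.
# parse_nvme_name is a hypothetical helper name, not part of nvme-cli.
parse_nvme_name() {
    name=${1#/dev/}
    # Accept nvme<C>, nvme<C>n<N> and nvme<C>n<N>p<P>
    if ! printf '%s\n' "$name" | grep -Eq '^nvme[0-9]+(n[0-9]+(p[0-9]+)?)?$'; then
        echo "not an NVMe device name: $1" >&2
        return 1
    fi
    # Try the most specific pattern first; only one of them can match
    printf '%s\n' "$name" | sed -E \
        -e 's/^nvme([0-9]+)n([0-9]+)p([0-9]+)$/controller=\1 namespace=\2 partition=\3/' \
        -e 's/^nvme([0-9]+)n([0-9]+)$/controller=\1 namespace=\2/' \
        -e 's/^nvme([0-9]+)$/controller=\1/'
}

parse_nvme_name /dev/nvme0       # controller only (character device)
parse_nvme_name /dev/nvme1n1     # namespace (block device, "the disk")
parse_nvme_name /dev/nvme1n1p3   # partition on that namespace
```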

NVMe at Hetzner Docs

Tools needed

apt install nvme-cli

Let’s change the drive

# First check the health of both drives
#   Show the SMART information and check that the overall health result is PASSED
#
smartctl -x /dev/nvme0n1
smartctl -x /dev/nvme1n1

# Show the drives that are part of each array
mdadm --detail /dev/md0
mdadm --detail /dev/md1
mdadm --detail /dev/md2

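# Remove the defective drive
#   The old drive must be removed from the RAID arrays, and this has to be done for each partition individually.
#   A partition that is still active in an array must be marked as failed first (mdadm /dev/mdX -f <partition>).
#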
# mdadm /dev/md0 -r /dev/nvme1n1p1
# mdadm /dev/md1 -r /dev/nvme1n1p2
# mdadm /dev/md2 -r /dev/nvme1n1p3

# My drives use MBR, not GPT
#
# Copy the MBR partition table from the healthy drive (nvme0n1) to the new one (nvme1n1)
sfdisk -d /dev/nvme0n1 | sfdisk /dev/nvme1n1

# Just in case, reboot now so that the new partition table takes effect

# Add the new partitions to the RAID arrays
#
mdadm /dev/md0 -a /dev/nvme1n1p1
mdadm /dev/md1 -a /dev/nvme1n1p2
mdadm /dev/md2 -a /dev/nvme1n1p3

# Check rebuild
cat /proc/mdstat

# Watch it rebuild
watch -n1 cat /proc/mdstat

# Speed up the RAID rebuild: show the current limit, then raise it (value in KiB/s)
sysctl dev.raid.speed_limit_max
sysctl -w dev.raid.speed_limit_max=9000000

# Due to the serial number change, we need to generate a new device map:
grub-mkdevicemap -n
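
Because the remove and add steps repeat once per partition, it can help to print the full list of commands for review before running anything. The sketch below only echoes commands and never calls mdadm itself. plan_mdadm is a hypothetical helper, and it assumes the layout used in this note, where the first array uses partition 1, the second partition 2, and so on.

```shell
# Print (do not execute) the mdadm commands for one phase of a disk swap.
# plan_mdadm is a hypothetical helper. It assumes that the Nth array given
# on the command line uses partition N of the disk (md0 -> p1, md1 -> p2, ...).
plan_mdadm() {
    phase=$1    # "remove" (before the swap) or "add" (after copying the partition table)
    disk=$2     # e.g. /dev/nvme1n1
    shift 2
    part=1
    for md in "$@"; do
        member="${disk}p${part}"
        case $phase in
            remove) echo "mdadm $md --fail $member --remove $member" ;;
            add)    echo "mdadm $md --add $member" ;;
            *)      echo "unknown phase: $phase" >&2; return 1 ;;
        esac
        part=$((part + 1))
    done
}

plan_mdadm remove /dev/nvme1n1 /dev/md0 /dev/md1 /dev/md2
plan_mdadm add    /dev/nvme1n1 /dev/md0 /dev/md1 /dev/md2
```

Compare the printed member names against the output of mdadm --detail before running any of the commands, since the array-to-partition mapping may differ on other systems.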
modified 28. May 2021