Systems with multiple disk controllers may fail on boot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
mdadm | New | Undecided | Unassigned |
Bug Description
mdadm shouldn't probe disks until all controllers are online.
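Until that ordering is enforced, one way to approximate it by hand is to wait for the member devices to show up before assembling. This is only a minimal sketch: `wait_for_devices` is a hypothetical helper, not part of mdadm, and it only checks that the device nodes exist, not that a controller has finished enumerating.

```shell
# Hypothetical helper (not part of mdadm): wait until every listed path
# exists, or give up after a total timeout given in seconds.
wait_for_devices() {
    timeout=$1
    shift
    elapsed=0
    for dev in "$@"; do
        while [ ! -e "$dev" ]; do
            if [ "$elapsed" -ge "$timeout" ]; then
                echo "timed out waiting for $dev" >&2
                return 1
            fi
            sleep 1
            elapsed=$((elapsed + 1))
        done
    done
    return 0
}
```

It could be used as `wait_for_devices 30 /dev/sdg2 /dev/sdh2 /dev/sdj2 /dev/sdk2 && sudo mdadm --assemble --scan`, where the 30 second timeout is an arbitrary assumption.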
Notice that the last two drives (/dev/sdj2 and /dev/sdk2) are two events behind the first two (7745 versus 7747).
sudo mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Raid Level : raid0
Total Devices : 4
Persistence : Superblock is persistent
State : inactive <<<<-----
Name : hostname:1 (local to host hostname)
UUID : a9a651a4:
Events : 7747
Number Major Minor RaidDevice
- 8 98 - /dev/sdg2
- 8 114 - /dev/sdh2
- 8 146 - /dev/sdj2
- 8 162 - /dev/sdk2
ubuntu@
/dev/sdg2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : a9a651a4:
Name : hostname:1 (local to host hostname)
Creation Time : Mon Dec 3 18:02:12 2018
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 7812235264 (3725.16 GiB 3999.86 GB)
Array Size : 7812235264 (7450.33 GiB 7999.73 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : 980085a5:
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Jun 12 18:04:53 2019
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : ba15e561 - correct
Events : 7747 <<<<-----
Layout : near=2
Chunk Size : 512K
Device Role : Active device 0
Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
ubuntu@
/dev/sdh2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : a9a651a4:
Name : hostname:1 (local to host hostname)
Creation Time : Mon Dec 3 18:02:12 2018
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 7812235264 (3725.16 GiB 3999.86 GB)
Array Size : 7812235264 (7450.33 GiB 7999.73 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : 40ea4305:
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Jun 12 18:04:53 2019
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : d5020aa3 - correct
Events : 7747 <<<<------
Layout : near=2
Chunk Size : 512K
Device Role : Active device 1
Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
ubuntu@
/dev/sdj2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : a9a651a4:
Name : hostname:1 (local to host hostname)
Creation Time : Mon Dec 3 18:02:12 2018
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 7812235264 (3725.16 GiB 3999.86 GB)
Array Size : 7812235264 (7450.33 GiB 7999.73 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : abfe40a8:
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Jun 12 18:04:48 2019
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : f6373178 - correct
Events : 7745 <<<<------
Layout : near=2
Chunk Size : 512K
Device Role : Active device 2
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
ubuntu@
/dev/sdk2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : a9a651a4:
Name : hostname:1 (local to host hostname)
Creation Time : Mon Dec 3 18:02:12 2018
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 7812235264 (3725.16 GiB 3999.86 GB)
Array Size : 7812235264 (7450.33 GiB 7999.73 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : b26fe395:
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Jun 12 18:04:48 2019
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : d9566329 - correct
Events : 7745 <<<<-----
Layout : near=2
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
Workaround
sudo mdadm --detail /dev/md1
sudo mdadm --examine /dev/sdh2
Look at the event counts for all drives. If they are out of sync by only 1-4 events, we should be OK.
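The comparison can be scripted rather than read off by eye. The sketch below assumes the `Events : N` line format shown in the `--examine` output above; `events_of` and `event_spread` are hypothetical helper names.

```shell
# Read `mdadm --examine` output on stdin and print the Events counter.
events_of() {
    awk -F: '/^[ \t]*Events[ \t]*:/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Print the gap between the highest and lowest event counts given as arguments.
event_spread() {
    min=$1
    max=$1
    for n in "$@"; do
        if [ "$n" -lt "$min" ]; then min=$n; fi
        if [ "$n" -gt "$max" ]; then max=$n; fi
    done
    echo $((max - min))
}

# Example (needs root and the real member devices):
#   counts=""
#   for d in /dev/sdg2 /dev/sdh2 /dev/sdj2 /dev/sdk2; do
#       counts="$counts $(sudo mdadm --examine "$d" | events_of)"
#   done
#   event_spread $counts
```

With the counts from this report (7747, 7747, 7745, 7745), `event_spread` prints 2.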
There are a few less aggressive ways to reassemble the array, but re-creating it, as below, is the only one
that actually seems to work:
sudo mdadm --create /dev/md1 -l 10 -n 4 /dev/sdg2 /dev/sdh2 /dev/sdj2 /dev/sdk2
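For comparison, one of the less aggressive options is a forced assemble, which keeps the existing superblocks and tells mdadm to accept members whose metadata looks out of date. It did not help on the system in this report, so treat the following as a sketch only. `run` and `force_assemble` are hypothetical wrappers that print the commands unless `RUN=1` is set; `--stop` and `--assemble --force` are standard mdadm options.

```shell
# Hypothetical wrapper: print the command by default; set RUN=1 to execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Stop the inactive array, then force-assemble it from the listed members.
force_assemble() {
    md=$1
    shift
    run sudo mdadm --stop "$md" &&
    run sudo mdadm --assemble --force "$md" "$@"
}
```

Calling `force_assemble /dev/md1 /dev/sdg2 /dev/sdh2 /dev/sdj2 /dev/sdk2` prints the two commands for review; running it again with `RUN=1` in the environment executes them.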
It is always a good idea to check the drive status afterwards:
sudo mdadm --detail /dev/md1
If any drives are listed as removed, add them back to the array.
To add a dropped drive:
sudo mdadm --manage /dev/md1 --add /dev/sdk2
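The check for removed drives can also be scripted. This sketch assumes that `mdadm --detail` marks an empty slot with the word `removed` at the end of its row in the device table; `count_removed` is a hypothetical helper name.

```shell
# Read `mdadm --detail` output on stdin and count device-table rows
# whose last field is "removed".
count_removed() {
    awk '$NF == "removed" { n++ } END { print n + 0 }'
}
```

For example, `sudo mdadm --detail /dev/md1 | count_removed` prints 0 when every slot is populated.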