Race condition at system-boot: md-RAID not always ready in time
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
udev (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
Binary package hint: udev
Almost every time I start the system I get error messages concerning my md-RAID devices.
Example:
"udevd-work[77]: inotify_
Sometimes only one of my several RAIDs is affected, sometimes more than one.
If the error shows up for the RAID that holds my root directory, the system does not boot but drops to a shell.
If only data drives are affected, the system finishes the boot process normally, and all RAID drives are up by then.
The log contains no noteworthy entries.
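Since the log shows nothing useful, the state of the arrays can be checked directly after boot. The following is only a minimal sketch, assuming the usual /proc/mdstat layout in which each array line looks like "md0 : active raid1 ..." or "md1 : inactive ...". The function name `inactive_arrays` and the sample text (including the block counts) are made up for illustration and are not output from the affected systems.

```python
import re

# Hypothetical sample of /proc/mdstat contents, for illustration only.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : inactive sdd1[0](S)
      1048512 blocks

md0 : active raid1 sdc1[1] sdb1[0]
      1048512 blocks [2/2] [UU]

unused devices: <none>
"""


def inactive_arrays(mdstat_text):
    """Return the names of md arrays whose state word is not 'active'."""
    names = []
    for line in mdstat_text.splitlines():
        # Array lines start with the device name, then " : ", then the state.
        match = re.match(r"^(md\w+)\s*:\s*(\w+)", line)
        if match and match.group(2) != "active":
            names.append(match.group(1))
    return names


print(inactive_arrays(SAMPLE_MDSTAT))
```

On a real machine the text would come from reading /proc/mdstat instead of the sample string.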
The problem occurs on all three Lucid installations I have made so far. One was an upgrade from Karmic and the next was a fresh install, both on real hardware. The third, from which this report was filed, is a fresh installation in VirtualBox.
/etc/mdadm/
# definitions of existing MD arrays
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=0.90 UUID=cd601f2c:
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=0.90 UUID=5057c6bb:
blkid:
/dev/sda1: UUID="712d6c10-
/dev/sda5: UUID="696c8371-
/dev/sdb1: UUID="cd601f2c-
/dev/sdc1: UUID="cd601f2c-
/dev/sdd1: UUID="5057c6bb-
/dev/sde1: UUID="5057c6bb-
/dev/md1: UUID="tM2vUv-
/dev/md0: UUID="5MJTwi-
/dev/mapper/
Ubuntu-Release:
Description: Ubuntu 10.04.1 LTS
Release: 10.04
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: udev 151-12
ProcVersionSign
Uname: Linux 2.6.32-24-generic i686
Architecture: i386
CustomUdevRuleF
Date: Mon Jul 26 16:18:20 2010
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta i386 (20100318)
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: innotek GmbH VirtualBox
ProcCmdLine: BOOT_IMAGE=
ProcEnviron:
LANG=de_DE.utf8
SHELL=/bin/bash
SourcePackage: udev
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.modalias: dmi:bvninnotekG
dmi.product.name: VirtualBox
dmi.product.
dmi.sys.vendor: innotek GmbH
Changed in udev (Ubuntu):
status: New → Confirmed
We also sporadically see similar issues here on a number of machines running 10.04.1. In our case there is only a data RAID, which is not needed for mounting the root partition, and on some boots this array is left only partially assembled.
Currently, we think it's caused by udevd being killed in the middle of its operation, see #613273.
If "mdadm --incremental" is interrupted at the wrong moment, it seems to cause a range of odd problems. These run from a leftover /dev/.tmp.md.8:xx, which makes mdadm bail out with "Strange error loading metadata for /dev/md0" on every subsequent operation until reboot, to badly damaged data structures in the kernel, with wrong device numbers, half-busy devices and the like.
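To see whether a boot left such temporary nodes behind, one could list them. This is only a rough sketch: it assumes the leftovers always match the name pattern ".tmp.md.*" seen above, and the helper name `leftover_md_nodes` is hypothetical. It only reports the files and does not attempt any cleanup.

```python
import glob
import os


def leftover_md_nodes(dev_dir="/dev"):
    """Return a sorted list of paths in dev_dir matching '.tmp.md.*'."""
    # The pattern begins with a literal dot, so glob includes these hidden names.
    return sorted(glob.glob(os.path.join(dev_dir, ".tmp.md.*")))


for path in leftover_md_nodes():
    print("leftover temporary md node:", path)
```

An empty result only means that no such file exists at the moment. It says nothing about the kernel-side state mentioned above.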