Recovery shell does not spawn if mountall stopped by esc (mountall 1.0)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
cryptsetup (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
Binary package hint: mountall
Source package: mountall 1.0
Ubuntu package: mountall 1.0 (all I get from apt-cache policy mountall)
Test System: MSI Wind U100 (Intel Atom), booting both from hard disk and from a slow flash drive with Karmic installed
Expected Actions:
On a hung mountall with dependent mounts (bad passphrase to deny access to encrypted /home), I expected pressing Esc to stop usplash (if running) and spawn a recovery shell.
On pressing Esc during a root partition fsck, I expected cancellation of the checks and continuation of boot (this works on non-root partitions). I would also accept continuation of fsck with Esc entirely ignored.
In every instance I can generate where mountall exits due to an error, I would expect mountall-shell.conf to actually spawn a recovery shell, no matter what stopped mountall from finishing.
Actual results:
Instead, in every test I have tried, pressing Esc does not spawn a shell, and if usplash is running, it does not exit but continues. I have not been able to force an fsck failure as the clock bug is fixed, and a bad passphrase on the LUKS partition generates the unavailable dependent mount/unavailable swap device open-ended wait, with the option to press Esc. Sometimes I have been able to bring up tty8 and get a reference to mountall exiting 8, but NO recovery shell, just a console ignoring all input.
It appears that mountall-shell is not triggering when mountall stops, leaving the boot process hung and not responsive to anything but reset. sudo initctl list shows a list of jobs including "mountall-shell stop/waiting" and "mountall-reboot stop/waiting", but when the interactive recovery shell should spawn, it does not.
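For anyone repeating this check, here is a minimal sketch of pulling a single job's state out of the job list. It assumes `initctl list` prints one job per line as `name goal/state`, optionally followed by `, process PID`; the helper name `job_state` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read an `initctl list`-style listing on stdin and
# print the goal/state of the job named in $1 (exact name match only, so
# "mountall" does not match "mountall-shell").
job_state() {
    awk -v job="$1" '
        $1 == job {
            sub(/,$/, "", $2)   # drop the trailing comma before ", process N"
            print $2
            exit
        }
    '
}
```

Usage on a running system would be `initctl list | job_state mountall-shell`; a result of `stop/waiting` at the point where the shell should be on screen matches the behaviour described above.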
Oh well, I guess that gets around the vulnerability to an "Esc spam root shell", which for obvious reasons I cannot duplicate, but it also means recovery from any error sufficient to force mountall to exit might have the same problem, leaving systems unbootable for those without live discs or other recovery media.
Until this is fixed, I recommend keeping a flash drive with Ubuntu Jaunty on it handy if you carry a laptop, and at least a live disc, if not the (faster) recovery flash drive, around your desktop, as you will not be able to get around this and boot otherwise if anything goes wrong with mounting your filesystems. Of course, this means all encrypted /home systems no longer have the ability to boot to an unencrypted, empty backup /home directory if, say, the LUKS mapping gets corrupted. That I have tested: without the shell you cannot even manually emit "filesystem" to get the boot to continue.
I've done more testing, and this is even worse than I thought.
If a filesystem fails fsck (tested by rolling the clock back a week), the recovery shell does NOT spawn and you CANNOT run fsck manually. If I do this on the console, I do not get any message from init about mountall stopping, only a message about fsck's exit status. On an Esc press against a waiting mountall, I get mountall:cancelled instead of a message from init about mountall ending. I get the SAME message if I try to cancel a root filesystem fsck run.
I think two things are happening here:
1: It looks like mountall isn't actually EXITING, so no "start on stopped mountall" job launches, and with mountall in an "infinite delayed exit," init simply waits forever. I put in a variety of test scripts set to start on stopped mountall, and not one of them would launch. I then set a script to start on startup, stop mountall, shut down the splash screen, run /sbin/sulogin, then restart mountall, and this worked as written.
This means you can spawn a shell, but the trigger for mountall-shell.conf (mountall stopped) is not getting sent.
2: On a root filesystem fsck run, an Esc press gets detected by mountall and treated as "cancel mountall", while on a non-root filesystem fsck run, mountall stops fsck but is not itself cancelled. Perhaps the internal variable in mountall responding to an Esc press is being improperly set during root fsck runs only?
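The workaround sequence from point 1 can be sketched roughly as below. This is not the exact script used in testing, only an illustration of the order of steps. The `usplash_write QUIT` call is an assumption about how the splash screen was shut down, and `run`, `recovery_sequence` and `DRY_RUN` are names invented for this sketch.

```shell
#!/bin/sh
# Sketch of the manual recovery sequence. With DRY_RUN=1 the commands are
# printed instead of executed, so the order can be checked safely.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

recovery_sequence() {
    run stop mountall || true        # stop the hung mountall job
    run usplash_write QUIT || true   # assumed way to close usplash
    run /sbin/sulogin                # recovery shell on the console
    run start mountall               # retry mounting once the shell exits
}
```

Called from the script stanza of a job that starts on startup, `recovery_sequence` would reproduce the behaviour that worked in testing; `DRY_RUN=1 recovery_sequence` only lists the four commands.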
If you install Karmic on a cheap flash drive, or in any other circumstance where frequent filesystem corruption is a fact of life, there are two workarounds: either set all fsck pass numbers to 0 and check the filesystem at intervals on another system, or open /etc/default/rcS and set FSCKFIX=yes. The danger of the latter is data loss in a severe filesystem corruption case, before you get to make any decisions about recovering data first and then fixing the filesystem.
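A rough sketch of both workarounds follows, written so they can be tried on copies of the files first. The fsck pass number is the sixth field of each /etc/fstab entry. The function names are invented for this example, and both functions only print the modified file to standard output rather than editing anything in place.

```shell
#!/bin/sh
# Print the fstab given in $1 with the sixth field (fsck pass number) set
# to 0 on every entry that has one. Comments and blank lines pass through
# unchanged; rewritten entries come out single-space separated.
disable_fsck_passes() {
    awk '
        /^[ \t]*#/ || NF == 0 { print; next }
        NF >= 6 { $6 = 0 }
        { print }
    ' "$1"
}

# Print the rcS file given in $1 with FSCKFIX=yes, replacing an existing
# (possibly commented-out) FSCKFIX line or appending one if there is none.
enable_fsckfix() {
    if grep -q '^#*FSCKFIX=' "$1"; then
        sed 's/^#*FSCKFIX=.*/FSCKFIX=yes/' "$1"
    else
        cat "$1"
        echo 'FSCKFIX=yes'
    fi
}
```

For example, `disable_fsck_passes /etc/fstab > /tmp/fstab.new` leaves the original untouched so the result can be reviewed before copying it back as root. Only one of the two workarounds is needed.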