Comment 1 for bug 476161

Luke (lukekuhn) wrote :

  I've done more testing, and this is even worse than I thought.

 If a filesystem fails fsck (tested by rolling the clock back a week), the recovery shell does NOT spawn and you CANNOT run fsck manually. When I do this on the console, I get no message from init about mountall stopping, only a message about fsck's exit status. If I press Esc while mountall is waiting, I get "mountall:cancelled" instead of a message from init about mountall ending. I get the SAME message if I try to cancel a root filesystem fsck run.

I think two things are happening here:

1: It looks like mountall isn't actually EXITING, so no "start on stopped mountall" job launches, and with mountall stuck in an "infinite delayed exit," init simply waits forever. I set up a variety of test scripts to start on stopped mountall, and not one of them would launch. I then set a script to start on startup, stop mountall, shut down the splash screen, run /sbin/sulogin, then restart mountall, and this worked as written.
This means you can spawn a shell, but the trigger for mountall-shell.conf (mountall stopped) is not getting sent.
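
The workaround job described above might look roughly like the sketch below. The file name rescue-shell.conf is hypothetical, the stanzas are what a karmic-era Upstart job is assumed to accept, and the splash screen step is left as a comment because the exact command depends on the splash program in use. The job text is wrapped in a small shell function so it can be written to a scratch path and inspected before being copied into /etc/init.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical Upstart job to the path given as $1.
write_rescue_job() {
    cat > "$1" <<'EOF'
# rescue-shell.conf (hypothetical name) - debugging aid, not a fix
description "spawn sulogin without waiting for mountall to stop"

start on startup
task
console owner

script
    # mountall may not be running yet, so ignore a failed stop
    stop mountall || true
    # (shut down the splash screen here; command depends on the splash in use)
    /sbin/sulogin
    start mountall
end script
EOF
}
```

For example, "write_rescue_job /tmp/rescue-shell.conf" writes the job to a scratch file for review. This only sidesteps the missing "stopped mountall" event; it does not explain why mountall never exits.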

2: On a root filesystem fsck run, an Esc press is detected by mountall and treated as "cancel mountall", while on a non-root filesystem fsck run, mountall stops fsck but is not itself cancelled. Perhaps the internal variable that tracks mountall's response to an Esc press is being set improperly during root fsck runs only?

  If you install karmic on a cheap flash drive, or in any other circumstance where frequent filesystem corruption is a fact of life, there are two workarounds: either set all fsck pass numbers to 0 in /etc/fstab and check the filesystem at intervals from another system, or open /etc/default/rcS and set FSCKFIX=yes. The danger of the latter is data loss in a severe filesystem corruption case: the automatic repair runs before you get to make any decisions about recovering data first and fixing the filesystem afterwards.
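
Either workaround can be tried on a copy of the file first. The following is a minimal sketch, not a tested recipe for any particular install: the function names are made up for illustration, both functions only print the modified file to standard output, and it assumes the usual six-field /etc/fstab layout with the fsck pass number in the last field.

```shell
#!/bin/sh
# Sketch only: both functions read the file named in $1 and print the
# modified text to stdout; nothing is changed in place.

# Workaround 1: set the sixth fstab field (fsck pass number) to 0.
# Comments, blank lines, and entries with fewer than six fields are left
# alone (a missing pass field already means "do not check").
# Note: rewritten lines lose their original column alignment.
disable_fsck_passes() {
    awk '
        /^[ \t]*#/ || NF < 6 { print; next }
        { $6 = 0; print }
    ' "$1"
}

# Workaround 2: make sure an active FSCKFIX=yes line is present.
# An existing uncommented FSCKFIX= line is replaced; otherwise one is appended.
enable_fsckfix() {
    if grep -q '^FSCKFIX=' "$1"; then
        sed 's/^FSCKFIX=.*/FSCKFIX=yes/' "$1"
    else
        cat "$1"
        echo 'FSCKFIX=yes'
    fi
}
```

For example, "disable_fsck_passes /etc/fstab > /tmp/fstab.new" leaves the real file untouched so the result can be compared with diff before it is installed. The data-loss caveat for FSCKFIX=yes applies regardless of how the setting is made.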