[SRU] drbd fence-peer breaks when using kernel 2.6.32-41
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
drbd8 (Ubuntu) | Fix Released | Medium | Ante Karamatić |
Lucid | Fix Released | Undecided | Ante Karamatić |
Bug Description
SRU Justification
Upstream commit:
e6cbc43 - http://
Description:
The latest 10.04 kernel (2.6.32-41) fixed an issue described in bug 963685. Because of this change, the drbd module, built with dkms, regressed and cannot be used as intended.
Notes (original report):
Ubuntu 10.04 Lucid with 2.6.32-41 kernel and drbd8
Kernel 2.6.32-41 fixed a consistency issue around UMH_WAIT_PROC in this bug:
https:/
This causes the drbd fencing script's exit codes to be interpreted incorrectly, which in turn breaks drbd fencing.
**** This also affects linux source in all distributions after Lucid with the applicable kernel versions patched in bug 963685 above since the drbd kernel module is mainlined in those more recent kernel versions ****
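As background for the exit-code mismatch described above, here is a minimal Python sketch of the conventional wait-status encoding on Linux, where a normal exit code sits in the upper byte of the status word. It is an illustration only: the function names are hypothetical, and it does not reproduce the actual kernel or drbd code paths, which are covered by the referenced bug and commit. It shows one general way a caller and callee can disagree about whether a value is a raw status or an already decoded exit code.

```python
def encode_wait_status(exit_code: int) -> int:
    """Hypothetical helper: build a raw wait status for a normal exit.

    Assumes the conventional Linux layout, with the exit code in bits 8..15.
    """
    return (exit_code & 0xFF) << 8


def decode_exit_code(status: int) -> int:
    """Hypothetical helper: extract the exit code from a raw wait status."""
    return (status >> 8) & 0xFF


script_exit = 4                               # what the fence-peer script exits with
raw_status = encode_wait_status(script_exit)  # 0x400
decoded_once = decode_exit_code(raw_status)   # 4, the intended value

# If one layer already decoded the status and another decodes it again,
# the exit code is lost and the result looks like 0.
decoded_twice = decode_exit_code(decoded_once)

print(hex(raw_status), decoded_once, decoded_twice)
```

The point of the sketch is that both sides must agree on the format of the returned value; when they do not, a non-zero exit such as 4 can be reported as 0.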
To replicate:
Have fencing enabled in drbd config:
In handlers section: fence-peer "/usr/lib/
In the disk section: fencing resource-only;
Have both drbd nodes UpToDate, with one primary and one secondary.
Make the fence-peer handler execute. I did this as follows:
Have drbd under pacemaker control, with both pacemaker nodes online and in sync and drbd primary on node 1. Put node 1 in standby. The fence-peer handler will then be executed.
The kernel will report that fence-peer exited with 0 (broken), for example:
May 15 09:45:17 kernel: [56645.420714] block drbd0: helper command: /sbin/drbdadm fence-peer minor-0
May 15 09:45:17 kernel: [56645.420920] block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 0 (0x0)
May 15 09:45:17 kernel: [56645.420925] block drbd0: fence-peer helper broken, returned 0
If you log the debug output of the fence-peer script (crm-fence-peer.sh) while it runs, it shows the script exiting with 4, not the 0 reported by the kernel.
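To check whether a node shows this symptom, the kernel log can be scanned for the "fence-peer helper broken" message. The following is a minimal sketch in Python, assuming the message format matches the log excerpt above; the function name and the regular expression are illustrative and not part of drbd.

```python
import re

# Pattern based on the log excerpt in this report (assumed format).
BROKEN_RE = re.compile(r"block (drbd\d+): fence-peer helper broken, returned (\d+)")


def find_broken_fence_reports(lines):
    """Return (device, reported_code) for each 'helper broken' log line."""
    reports = []
    for line in lines:
        match = BROKEN_RE.search(line)
        if match:
            reports.append((match.group(1), int(match.group(2))))
    return reports


sample_log = [
    "May 15 09:45:17 kernel: [56645.420714] block drbd0: helper command: /sbin/drbdadm fence-peer minor-0",
    "May 15 09:45:17 kernel: [56645.420920] block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 0 (0x0)",
    "May 15 09:45:17 kernel: [56645.420925] block drbd0: fence-peer helper broken, returned 0",
]

print(find_broken_fence_reports(sample_log))
```

A reported code of 0 on a node where the handler's own debug log shows a non-zero exit is the mismatch described in this report.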
This commit in drbd git should fix this behavior:
http://
Without the fix, a drbd setup that uses fencing completely fails to auto-recover or continue; it requires manual intervention and repair.
Changed in linux (Ubuntu): | |
importance: | Undecided → Medium |
tags: | added: lucid regression-update |
description: | updated |
Changed in drbd8 (Ubuntu): | |
status: | New → Confirmed |
description: | updated |
Changed in drbd8 (Ubuntu): | |
assignee: | nobody → Ante Karamatić (ivoks) |
description: | updated |
no longer affects: | linux (Ubuntu) |
tags: | added: verification-done; removed: verification-needed |
This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 1000355
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.