Comment 0 for bug 1475247

Adam Collard (adam-collard) wrote :

During an Autopilot deployment on gMAAS, Juju hung while running a mon-relation-changed hook:

$ ps afxwww | grep -A 4 [m]on-relation-changed
  29118 ? S 0:03 \_ /usr/bin/python /var/lib/juju/agents/unit-ceph-1/charm/hooks/mon-relation-changed
  37996 ? S 0:00 \_ /bin/sh /usr/sbin/ceph-disk-prepare --fs-type xfs --zap-disk /dev/sdb
  37998 ? S 0:00 \_ /usr/bin/python /usr/sbin/ceph-disk prepare --fs-type xfs --zap-disk /dev/sdb
  38016 ? D 0:00 \_ /sbin/sgdisk --zap-all --clear --mbrtogpt -- /dev/sdb

The process tree had been in this state for more than 10 minutes. The logs[1] from the affected unit pointed to a problem with the partition tables on that disk (/dev/sdb).
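
In the ps output above, sgdisk is the only process in state D (uninterruptible sleep), which usually means it is blocked inside the kernel, typically on I/O. As a rough sketch for spotting the same symptom on other units, the following parses ps output in the column layout shown above (PID, TTY, STAT, TIME, COMMAND) and lists processes whose state starts with D. The helper name find_uninterruptible is hypothetical and the column layout is an assumption; other ps invocations print different columns.

```python
def find_uninterruptible(ps_output):
    """Return (pid, command) pairs for processes whose STAT starts with 'D'.

    Assumes the five-column layout seen in this report:
    PID TTY STAT TIME COMMAND.
    """
    stuck = []
    for line in ps_output.splitlines():
        fields = line.split(None, 4)
        # Skip the header row, blank lines, and anything too short to parse.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        pid, _tty, stat, _time, command = fields
        if stat.startswith("D"):
            # Drop the tree-drawing prefix ("\_", "|") that `ps f` adds.
            stuck.append((int(pid), command.lstrip(" |\\_")))
    return stuck


# The process tree from this report, used as sample input.
SAMPLE = r"""
  29118 ? S 0:03 \_ /usr/bin/python /var/lib/juju/agents/unit-ceph-1/charm/hooks/mon-relation-changed
  37996 ? S 0:00 \_ /bin/sh /usr/sbin/ceph-disk-prepare --fs-type xfs --zap-disk /dev/sdb
  37998 ? S 0:00 \_ /usr/bin/python /usr/sbin/ceph-disk prepare --fs-type xfs --zap-disk /dev/sdb
  38016 ? D 0:00 \_ /sbin/sgdisk --zap-all --clear --mbrtogpt -- /dev/sdb
"""

if __name__ == "__main__":
    for pid, command in find_uninterruptible(SAMPLE):
        print(pid, command)
```

Run against the sample, this reports only the sgdisk process (PID 38016). A single snapshot cannot distinguish a brief I/O wait from a real hang, so the check would need repeating over time before drawing conclusions.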

I fixed this by hand using gdisk[2].

[1] https://pastebin.canonical.com/135426/
[2] http://paste.ubuntu.com/11887096/