Comment 3 for bug 1424048

JuanJo Ciarlante (jjo) wrote :

FYI this has corosync_transport: unicast.

We changed our deployment sequence to deploy the 3 HA
units at once and then relate the OpenStack services. That
gave a different issue on some of the units (the affected
units changed each time I redeployed; I tried a couple of times):

$ juju run --timeout=10s --service=keystone 'sudo crm status 2>/dev/null|egrep Started:'
- Error: command timed out
  MachineId: 0/lxc/6
  Stdout: ""
  UnitId: keystone/0
- MachineId: 1/lxc/3
  Stdout: ' Started: [ juju-machine-1-lxc-3 juju-machine-2-lxc-3 ]

'
  UnitId: keystone/1
- MachineId: 2/lxc/3
  Stdout: ' Started: [ juju-machine-1-lxc-3 juju-machine-2-lxc-3 ]

'
  UnitId: keystone/2

Logging into the timed-out unit shows that pacemaker was not
started and that hanode-relation-changed was looping endlessly
on a failing 'crm node list'. After I started pacemaker there,
the hook completed OK: http://paste.ubuntu.com/10454923/
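
For illustration only, here is a minimal sketch of a bounded retry around a failing command, as opposed to the endless loop seen in the hook. This is not the charm's actual code: the function name retry_bounded, the attempt count and the delay are all hypothetical, and the 'service pacemaker start' fallback in the comment is just an assumption about how pacemaker was started by hand.

```shell
#!/bin/bash
# Hypothetical helper (not part of the hacluster charm): run a command up to
# a fixed number of times, then give up instead of looping forever.
# Usage: retry_bounded ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_bounded() {
    local attempts=$1 delay=$2
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        # Success: stop retrying and report it to the caller.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep between attempts, but not after the final failure.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    # Every attempt failed; let the caller decide what to do next.
    return 1
}

# Example use on an affected unit (assumed commands, not run here):
#   retry_bounded 5 2 sudo crm node list || sudo service pacemaker start
```

With something like this, a unit where pacemaker never came up would fail after a few attempts and surface the error, rather than sitting in the hook until juju run times out.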