swift proxy has inconsistent ring builder across units

Bug #1381040 reported by Edward Hope-Morley
Affects: swift-proxy (Juju Charms Collection)
Status: Fix Released
Importance: Critical
Assigned to: Edward Hope-Morley

Bug Description

Deploying Swift from the stable branch with:

* 3 proxy units + hacluster
* 3 storage units

produces inconsistent /etc/swift/*.builder files across the proxy units, i.e. a broken cluster.
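
One quick way to confirm the divergence (a sketch, assuming three proxy units numbered 0-2) is to compare checksums of the builder files on each unit; differing sums mean the rings are inconsistent:

$ juju ssh swift-proxy/0 'md5sum /etc/swift/*.builder'
$ juju ssh swift-proxy/1 'md5sum /etc/swift/*.builder'
$ juju ssh swift-proxy/2 'md5sum /etc/swift/*.builder'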

Changed in swift-proxy (Juju Charms Collection):
status: New → In Progress
assignee: nobody → Edward Hope-Morley (hopem)
tags: added: cts openstack
Revision history for this message
Ante Karamatić (ivoks) wrote:

Using stable charms and this procedure:

$ juju deploy --to 1 --repository . --config=config.yaml local:trusty/swift-storage swift-zone1
$ juju deploy --to 2 --repository . --config=config.yaml local:trusty/swift-storage swift-zone2
$ juju deploy --to 3 --repository . --config=config.yaml local:trusty/swift-storage swift-zone3
$ juju deploy --to lxc:1 --repository . --config=config.yaml local:trusty/swift-proxy
$ juju add-unit --to lxc:2 swift-proxy
$ juju add-unit --to lxc:3 swift-proxy
$ juju add-relation swift-proxy swift-hacluster
$ juju add-relation swift-proxy swift-zone1
$ juju add-relation swift-proxy swift-zone2
$ juju add-relation swift-proxy swift-zone3
$ juju add-relation swift-proxy glance
$ juju add-relation swift-proxy keystone

Behavior is inconsistent: most of the time it works, but sometimes the builder files differ across the proxies. This is the relevant config from the yaml:

swift-proxy:
  auth-type: 'keystone'
  use-https: 'False'
  vip: '192.168.0.10'
  vip_iface: eth0
  zone-assignment: 'manual'
  swift-hash: 'random_hash'
swift-zone1:
  block-device: '/dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk'
  overwrite: 'true'
  zone: 1
swift-zone2:
  block-device: '/dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk'
  overwrite: 'true'
  zone: 2
swift-zone3:
  block-device: '/dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk'
  overwrite: 'true'
  zone: 3

We've also tried manual zone assignment, with the same result. Only one of the swift-proxy units is ever able to actually communicate with the storage zones.
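
For reference, one way to check this from the cluster side (a sketch, assuming the recon middleware is enabled on the object servers, as it is by default) is to compare ring md5sums between a proxy and the storage nodes:

$ juju ssh swift-proxy/0 'swift-recon --md5'

Mismatches between what the object servers report and the local ring files would indicate the proxies and storage nodes are not all on the same ring.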

Revision history for this message
Edward Hope-Morley (hopem) wrote:

Ante, quick thought: how many of the devices in block-device: '/dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk' actually exist? Is each storage host equal in what devices it has? And is it possible that devices were added and/or removed from any of the storage hosts? Just trying to narrow down the scenario a bit here. This really smells like a case of the storage nodes changing (e.g. new devices coming online) but the update not reaching all proxy nodes (so clearly some kind of race condition). We definitely have a problem with the way we manage .builder files on proxy nodes and need to find a safer way to do it.
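
A quick way to compare the hosts (a sketch, assuming one unit per zone service) is to list the block devices on each storage machine:

$ juju ssh swift-zone1/0 'lsblk -d -o NAME,SIZE,MODEL'
$ juju ssh swift-zone2/0 'lsblk -d -o NAME,SIZE,MODEL'
$ juju ssh swift-zone3/0 'lsblk -d -o NAME,SIZE,MODEL'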

Revision history for this message
Ante Karamatić (ivoks) wrote:

All the nodes are exactly the same - same number of disks, same disk sizes, same disk model.

Revision history for this message
Edward Hope-Morley (hopem) wrote:

Still investigating this. I have managed to reproduce the inconsistent .builder files, and it looks like even if swift is given the same parameters when building .builder files on multiple proxies, the resulting files will not necessarily be identical. So either this is a bug in swift, or we just need to build on one proxy only and share the result across units. Still digging...
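
To illustrate the hypothesis (a sketch with made-up IPs, ports and weights, not the charm's actual invocation): running the identical swift-ring-builder sequence independently on two hosts need not produce byte-identical builder files, since rebalance does not guarantee a deterministic partition assignment.

$ swift-ring-builder object.builder create 18 3 1
$ swift-ring-builder object.builder add z1-192.168.0.11:6000/sdc 100
$ swift-ring-builder object.builder add z2-192.168.0.12:6000/sdc 100
$ swift-ring-builder object.builder add z3-192.168.0.13:6000/sdc 100
$ swift-ring-builder object.builder rebalance
$ md5sum object.builder

Running this on two hosts and comparing the resulting md5sums is a simple way to test the theory.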

tags: added: backport-potential
Changed in swift-proxy (Juju Charms Collection):
status: In Progress → Fix Committed
Changed in swift-proxy (Juju Charms Collection):
status: Fix Committed → Fix Released