Allowing duplicate secgroups via neutron breaks 2.0.3 models with 2.1 controllers
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Fix Released | Critical | Ian Booth |
2.1 | Fix Released | Critical | Heather Lanigan |
Bug Description
Last night, the Juju 2 controllers in PS4.5 were upgraded to Juju 2.1 (I think 2.1.1). Some of the models are still on 2.0.3. I don't know for sure that this bug only affects the older models, but that's all the data I have so far.
My first attempted deployment after the controller upgrade failed, with "juju status --format=yaml" reporting the following (I elided the byte-by-byte UserData serialisations for length; a more complete version is in https:/
"359":
  juju-status:
    current: down
    message: agent is not communicating with the server
    since: 08 Mar 2017 13:52:50Z
  instance-id: pending
  machine-status:
    current: provisioning error
    message: |-
      cannot run instance: failed to run a server with nova.RunServerO
      caused by: request (http://
    since: 08 Mar 2017 13:48:50Z
  series: xenial
"360":
  juju-status:
    current: down
    message: agent is not communicating with the server
    since: 08 Mar 2017 13:52:52Z
  instance-id: pending
  machine-status:
    current: provisioning error
    message: |-
      cannot run instance: failed to run a server with nova.RunServerO
      caused by: request (http://
    since: 08 Mar 2017 13:49:30Z
  series: xenial
I tried removing the new secgroup, but that didn't help; juju just put it right back on the next run. I then tried removing the old secgroup instead. That required replacing it on all machines, and for some reason it left the units within the deployment unable to send any packets to each other. Chris Stratford eventually suggested the workaround of running "juju add-unit" on a random application, which apparently caused juju to fix up the inter-unit networking, although we couldn't work out exactly what it had done.
Since all that, I've heard the same report from two other people, so I investigated further. I tracked down commit 3e8e8e5c9c68237
-	defaultGroup, err := c.environ.
+	// Security Group Names in Neutron do not have to be unique. This
+	// function returns an array
+	defaultGroups, err := c.environ.
-	group, err := novaClient.
+	groupsFound, err := neutronClient.
 	if err == nil {
-		// Group exists, so assume it is correctly set up and return it.
-		// TODO(jam): 2013-09-18 http://
-		// We really should verify the group is set up correctly,
-		// because deleting and re-creating environments can get us bad
-		// groups (especially if they were set up under Python)
-		return *group, nil
+		for _, group := range groupsFound {
+			if c.verifyGroupRu
+				return group, nil
+			}
+		}
 	}
So I suspect that a better workaround in this situation would be to delete the new secgroup and add the correct rules (possibly the IPv6 rules, which weren't present in the old secgroup) to the old secgroup using neutron. However, this seems to be a pretty bad incompatibility. Perhaps juju should be more conservative and use the old "find existing group and append to it" behaviour with neutron just as it used to do with nova, even if neutron itself is happy with duplicate secgroup names?
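To make the suggestion above concrete, here is a minimal sketch of the "find existing group and append to it" behaviour. It is not Juju's or goose's code: `Rule`, `Group`, `hasRule`, `ensureGroup`, the group name `juju-example` and the IDs are all hypothetical stand-ins, and the group list is held in memory rather than fetched from neutron.

```go
package main

import "fmt"

// Rule and Group are hypothetical stand-ins for illustration only;
// they are not the goose or Juju types.
type Rule struct {
	EtherType string // "IPv4" or "IPv6"
	Protocol  string
	PortMin   int
	PortMax   int
}

type Group struct {
	ID    string
	Name  string
	Rules []Rule
}

// hasRule reports whether g already contains a rule equal to r.
func hasRule(g Group, r Rule) bool {
	for _, have := range g.Rules {
		if have == r {
			return true
		}
	}
	return false
}

// ensureGroup reuses the first existing group with the given name and
// appends whichever wanted rules it lacks. A group is created only when
// no group with that name exists, so a name never ends up with two groups.
// The second result reports whether a new group was created.
func ensureGroup(existing []Group, name string, want []Rule) (Group, bool) {
	for _, g := range existing {
		if g.Name != name {
			continue
		}
		// Copy the rules so the caller's slice is not modified.
		g.Rules = append([]Rule(nil), g.Rules...)
		for _, r := range want {
			if !hasRule(g, r) {
				g.Rules = append(g.Rules, r)
			}
		}
		return g, false
	}
	return Group{ID: "new-id", Name: name, Rules: append([]Rule(nil), want...)}, true
}

func main() {
	// An "old" group that lacks the IPv6 rule, as described in the report.
	old := Group{ID: "old-id", Name: "juju-example", Rules: []Rule{{"IPv4", "tcp", 22, 22}}}
	want := []Rule{{"IPv4", "tcp", 22, 22}, {"IPv6", "tcp", 22, 22}}

	g, created := ensureGroup([]Group{old}, "juju-example", want)
	fmt.Println(g.ID, created, len(g.Rules))
}
```

With the old group present, the sketch keeps its ID and adds only the missing IPv6 rule instead of creating a second group under the same name.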
Changed in juju-core:
assignee: nobody → Heather Lanigan (hmlanigan)
affects: juju-core → juju
Changed in juju:
milestone: none → 2.2-alpha1
importance: Undecided → High
status: New → Triaged
Changed in juju:
status: Triaged → In Progress
Changed in juju:
importance: High → Critical
tags: added: openstack-provider uosci
tags: added: upgrade-juju
tags: added: eda
Changed in juju:
milestone: 2.2-alpha1 → 2.2-beta1
Changed in juju:
status: Fix Committed → Fix Released
I believe that even Juju 2.1 runs nova with the security group name rather than its ID, so it seems to me that this will probably break even with 2.1 on the model, but I can't prove that at the moment. My guess is that you can reproduce by creating an OpenStack-based model with 2.0, then upgrading the controller to 2.1, then trying to deploy anything more to the model.
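The name-versus-ID concern can be illustrated with a small sketch. It only models why a lookup by name becomes ambiguous once two groups share that name; it does not reproduce nova's actual behaviour or error text, which are truncated in the status output above. `Group`, `resolveByName`, the error values, the group name and the IDs are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Group is a hypothetical stand-in for a security group record.
type Group struct {
	ID   string
	Name string
}

var (
	errNotFound  = errors.New("no security group matches name")
	errAmbiguous = errors.New("multiple security groups match name")
)

// resolveByName returns the ID of the single group called name.
// It fails when the name matches no group or more than one group.
func resolveByName(groups []Group, name string) (string, error) {
	var ids []string
	for _, g := range groups {
		if g.Name == name {
			ids = append(ids, g.ID)
		}
	}
	switch len(ids) {
	case 0:
		return "", fmt.Errorf("%w: %q", errNotFound, name)
	case 1:
		return ids[0], nil
	default:
		return "", fmt.Errorf("%w: %q has %d candidates", errAmbiguous, name, len(ids))
	}
}

func main() {
	// Two groups with the same name, as after the controller upgrade.
	groups := []Group{
		{ID: "old-id", Name: "juju-example"},
		{ID: "new-id", Name: "juju-example"},
	}
	if _, err := resolveByName(groups, "juju-example"); err != nil {
		fmt.Println("lookup by name failed:", err)
	}
	// Referring to a group by ID avoids the ambiguity entirely.
	fmt.Println("lookup by ID would use:", groups[0].ID)
}
```

If the guess above is right, passing the group ID rather than its name when starting a server would sidestep this ambiguity regardless of how many groups share the name.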