After action-managed-upgrade from queens to rocky with Neutron DVR enabled, neutron-dhcp-agent package is left removed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Nova Compute Charm | Triaged | High | Unassigned |
Bug Description
We're upgrading a Neutron DVR cloud from bionic-queens to bionic-rocky.
On the nova-compute service(s) where we performed action-managed-upgrade, the neutron-dhcp-agent package was left removed.
The workaround is to re-install neutron-dhcp-agent across the openstack-upgraded nova-compute units.
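For reference, a minimal sketch of applying that workaround fleet-wide with juju (juju run executes as root on each unit; exact unit targeting is up to the operator):

# Re-install the DHCP agent on the nova-compute units that have already been upgraded
juju run --application nova-compute 'apt-get install -y neutron-dhcp-agent'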
We are currently checking whether action-managed-upgrade is what triggers this state.
From the logs, we see the following in preparation for the upgrade from py2 to py3:
2021-03-17 17:50:04 DEBUG openstack-upgrade The following packages will be REMOVED:
2021-03-17 17:50:04 DEBUG openstack-upgrade python-ceilometer* python-neutron* python-
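For anyone wanting to preview this ahead of time, a simulation along these lines should show the pending removals once the rocky UCA pocket is configured on a unit (a sketch; the unit name is an example):

# Dry-run the dist-upgrade on one hypervisor and show what apt would remove
juju run --unit nova-compute/0 'apt-get -s dist-upgrade' | grep -A 20 'REMOVED'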

As noted in lp#1828259, which covered neutron-l3-agent being removed on DVR clouds upgrading to rocky, the nova-compute charm is supposed to take the neutron subordinate's package needs into account during a package upgrade and re-install the neutron agents that were removed.
https:/
We may need to add neutron-dhcp-agent to the list of package exceptions noted by Corey in the nova-compute charm.
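In the meantime, a quick way to check which neutron agent packages survived on each hypervisor might be (a sketch):

# List the neutron packages still installed (state 'ii') on every nova-compute unit
juju run --application nova-compute "dpkg -l 'neutron-*' | grep ^ii"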
tags: added: openstack-upgrade
Update: when performing a non-action-managed-upgrade, this error state does not occur.
juju config nova-compute openstack-origin=cloud:bionic-rocky action-managed-upgrade=false
The user story for needing action-managed-upgrade=true is to know which VMs will be affected, and at what times, by the openvswitch restarts that happen as part of the queens to rocky upgrade, rather than having a random 30-60 minute window during which OVS may drop out on any hypervisor.
Considering that our managed-service data-plane uptimes presume workloads are spread across multiple availability zones, I wonder whether it would be possible for the nova-compute charm to lock upgrades to one AZ at a time when not using action-managed-upgrade.
Imagine an openstack upgrade of a cloud running a kubernetes cluster on the overlay: we'd want all of the HA services that are properly distributed across AZs, such as metallb, kubernetes-master, kubeapi-loadbalancer, etc., not to be taken offline network-wise at the same time, so there aren't split-brain issues in clustered applications.
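Until something like that exists in the charm, an operator-driven approximation with action-managed-upgrade=true might look roughly like this (a sketch; the per-AZ unit groupings are illustrative and would come from the operator's own AZ mapping):

# Upgrade the hypervisors in one AZ, then move on to the next AZ
for unit in nova-compute/0 nova-compute/1 nova-compute/2; do   # units in AZ1
    juju run-action "$unit" openstack-upgrade --wait
done
# ...repeat for the units in AZ2, AZ3, and so on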