octavia loadbalancer refuses to manually failover

Bug #1878029 reported by Dan Ackerson
This bug affects 1 person
Affects            Status   Importance   Assigned to   Milestone
octavia (Ubuntu)   New      Undecided    Unassigned

Bug Description

On 2020-04-09, we discovered that the failover command creates a new LB instance, attempts to kill the old instance (which does not exist), and then fails to plug a network port because an OVS port already exists on the host. We do not know why that port is there.

This was discovered during a failed migration attempt using an amphora image, after which the failover command was tried. We run the Octavia loadbalancer with only a single amphora (standalone).

Some of the Octavia units had been missing their RabbitMQ configuration, most likely since October 2019. We fixed this when we discovered it.

SEG escalation found the following:
- The duplicate port binding entry in the database causes a `loadbalancer failover` to fail in a way similar to what is reported here
- Removing the additional ml2_port_bindings entry does not fix the issue
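
For illustration only, a duplicate binding of this kind could be spotted with a read-only grouping query. The sketch below is not the procedure SEG used. It assumes the `ml2_port_bindings` table has a `port_id` column, and the `host` and `status` columns are likewise assumptions. An in-memory SQLite table stands in for the real Neutron database, and the helper names `find_duplicate_bindings` and `demo` are hypothetical.

```python
import sqlite3


def find_duplicate_bindings(conn):
    """Return (port_id, row_count) for each port_id that has more than one row."""
    cur = conn.execute(
        "SELECT port_id, COUNT(*) AS n "
        "FROM ml2_port_bindings "
        "GROUP BY port_id "
        "HAVING COUNT(*) > 1 "
        "ORDER BY port_id"
    )
    return cur.fetchall()


def demo():
    # Stand-in schema: the column set is an assumption, not the real Neutron model.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ml2_port_bindings (port_id TEXT, host TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO ml2_port_bindings VALUES (?, ?, ?)",
        [
            ("port-a", "compute-1", "ACTIVE"),
            ("port-b", "compute-1", "ACTIVE"),
            ("port-b", "compute-2", "INACTIVE"),  # second row for the same port
        ],
    )
    return find_duplicate_bindings(conn)


if __name__ == "__main__":
    for port_id, count in demo():
        print(f"{port_id}: {count} binding rows")
```

The query only reads. As noted in this report, deleting the extra row by hand did not make failover work.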

Tags: bootstack
Dan Ackerson (dan.ackerson) wrote :

SEG comment (from 2020-04-21):

The OpenStack CLI cannot be used to remedy the database state. Any
attempt to operate on the VRRP port fails because Neutron does not
expect two entries with identical port IDs.

Manual removal of the additional entry in the database does not
resolve the problem. During a failover, Octavia starts a new
amphora VM and tries to add the VRRP port to the new VM. This step
fails because the database is still in an inconsistent state. We
did not find a way to update the databases to allow for a regular
loadbalancer failover, but we assume that several tables would
potentially have to be changed consistently with one another.

While such further database updates are possible, the risks of
creating an inconsistent database state with potential impact
beyond the loadbalancer are significant. It might be more
acceptable and would certainly be safer to instead delete the
failed loadbalancer and re-create it.

James Page (james-page)
affects: charm-octavia → octavia (Ubuntu)