Power sync using the Ironic driver queries all the nodes from Ironic when using Conductor Groups

Bug #1933955 reported by Belmiro Moreira
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Undecided
Assigned to: Julia Kreger
Milestone: (none)

Bug Description

"""
While synchronizing instance power states, found 447 instances in the database and 8712 instances on the hypervisor.
"""

This is the warning message we get during a power sync when conductor groups are used.

Conductor groups make it possible to dedicate nova-compute nodes to managing a set of Ironic nodes.
However, "_sync_power_states" doesn't handle them correctly.

First, this function gets all the nodes from the DB that are managed by this Nova compute node. Then it asks the "driver" for all of its instances. When using the Ironic driver, this returns all the nodes in Ironic! (With thousands of nodes, Ironic can also take several minutes to respond, but that is a different bug.)

Of course, the comparison then fails, producing the warning message above.
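To make the mismatch concrete, here is a minimal sketch (not the actual Nova code) of the comparison that produces the warning; "get_instances_by_host" and "list_instances" are stand-ins for the real DB query and virt-driver call:

import logging

LOG = logging.getLogger(__name__)


def sync_power_states_sketch(db, driver, host):
    # Instances owned by this nova-compute host according to the Nova DB
    # (the 447 in the message above).
    db_instances = db.get_instances_by_host(host)

    # Instances as reported by the virt driver. With the Ironic driver this
    # is effectively every node registered in Ironic, not just the nodes in
    # this host's conductor group (the 8712 in the message above).
    driver_instances = driver.list_instances()

    if len(db_instances) != len(driver_instances):
        LOG.warning("While synchronizing instance power states, found "
                    "%d instances in the database and %d instances on "
                    "the hypervisor.",
                    len(db_instances), len(driver_instances))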

There are different possibilities...
- We can change the Ironic driver to return only the nodes from the conductor group that this Nova compute node belongs to (see the sketch after this list). However, this is not good enough if the conductor group is managed by more than one Nova compute node: Ironic doesn't know which Nova compute node manages each node!

- We agree that this check doesn't bring a lot of value when using the Ironic driver, and we simply skip it when the Ironic driver is used.
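For the first option, a hedged sketch of what listing only one conductor group's nodes could look like, using openstacksdk and assuming the conductor_group query filter is available (Bare Metal API >= 1.46); how the compute service learns its own conductor group is left as a plain argument here:

import openstack


def list_nodes_for_conductor_group(conductor_group):
    # Connect using the usual OS_* environment variables.
    conn = openstack.connect(cloud='envvars')
    # Only nodes assigned to this conductor group are returned, so the
    # driver-side count would at least be scoped to the group rather
    # than to every node in Ironic.
    return list(conn.baremetal.nodes(conductor_group=conductor_group))

As the bullet above says, even with this filter the counts still won't line up when more than one Nova compute node shares the conductor group, because Ironic has no notion of which compute node owns which node.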

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

So a possibility is to just use the in-driver cache to generate the list, as it should have data representing what is inside the nova-compute process's conductor-group view of nodes, so the counts *should* match, and using the cache won't result in a list of all nodes in the Ironic database. If memory serves, the code in the nova.virt.ironic driver which uses the cache also fetches the nodes *if* the cache is not populated, and the cache gets refreshed as resource tracking updates run.
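For illustration only, a rough sketch of that idea, assuming a node cache and refresh method along the lines of what the nova.virt.ironic driver keeps (self.node_cache and self._refresh_cache() are illustrative names here, not a claim about the exact current code):

def list_instance_uuids(self):
    # Answer from the driver's node cache, which is already scoped to
    # this nova-compute service's view of Ironic (conductor group /
    # hash ring), and only hit Ironic if the cache was never populated.
    if not self.node_cache:
        # Resource tracking normally keeps this fresh; populate it
        # here only if it is still empty.
        self._refresh_cache()
    return [node.instance_uuid
            for node in self.node_cache.values()
            if node.instance_uuid]

With something like this, the count the power-sync loop compares against would only reflect the nodes this nova-compute service actually manages.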

Changed in nova:
status: New → Confirmed
assignee: nobody → Julia Kreger (juliaashleykreger)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/829613

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by "Julia Kreger <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/nova/+/829613
Reason: Given that we're not going to be able to merge the prior patch as is, this patch is really not viable, since it is built on top of the underlying problem being fixed.

