Power sync using the Ironic driver queries all the nodes from Ironic when using Conductor Groups
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | In Progress | Undecided | Julia Kreger | | |
Bug Description
"""
While synchronizing instance power states, found 447 instances in the database and 8712 instances on the hypervisor.
"""
This is the warning message that we get when using conductor groups during a power sync.
Conductor groups allow dedicated nova-compute nodes to manage a set of Ironic nodes.
However, the "_sync_
First, this function gets all the nodes from the DB that are managed by the Nova compute node. Then it asks the "driver" to get all the instances. When using the Ironic driver, it returns all the nodes in Ironic! (With thousands of nodes, Ironic can also take several minutes to return, but that is a different bug.)
Of course, the comparison then fails, producing the warning message above.
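The mismatch described above can be reproduced with a small sketch. This is a simplified model of the count comparison, not the actual nova code: the function name `count_mismatch_message` and the generated instance IDs are hypothetical, and only the two counts come from the warning quoted in this report.

```python
# Hypothetical, simplified model of the count check; not the real nova code.
def count_mismatch_message(db_instances, hypervisor_instances):
    """Return the warning text if the two counts differ, else None."""
    num_db = len(db_instances)
    num_hv = len(hypervisor_instances)
    if num_db == num_hv:
        return None
    return (
        "While synchronizing instance power states, found "
        f"{num_db} instances in the database and "
        f"{num_hv} instances on the hypervisor."
    )


# Instances owned by this nova-compute service (what the DB query returns).
db_instances = [f"inst-{i}" for i in range(447)]
# Everything Ironic reports, across all conductor groups.
all_ironic_instances = [f"inst-{i}" for i in range(8712)]

message = count_mismatch_message(db_instances, all_ironic_instances)
print(message)
```

Because the driver side is not scoped to the compute service, the two lists can never be expected to have the same length once conductor groups are in use.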
There are different possibilities...
- We can change the Ironic driver to return only the nodes from the conductor group that this Nova compute node belongs to. However, this is not good enough if the conductor group is managed by more than one Nova compute node. Ironic doesn't know which Nova compute node manages each node!
- We agree that this check doesn't bring much value when using the Ironic driver, and we simply skip it when that driver is in use.
Changed in nova: | |
status: | New → Confirmed |
assignee: | nobody → Julia Kreger (juliaashleykreger) |
So one possibility is to just use the in-driver cache to generate the list, as it should hold data representing the nova-compute process's conductor group view of nodes, so the counts *should* match, and using the cache won't result in a list of all nodes in the Ironic database. If memory serves, the code in the nova.virt.ironic driver that uses caching will also fetch the nodes *if* the cache is not populated, and the cache gets refreshed as resource tracking updates run.
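A rough sketch of the cache-backed idea, under stated assumptions: `NodeCacheSketch`, `fetch_nodes`, and the node dictionary keys are all hypothetical names for illustration, not the real nova.virt.ironic API. It only shows the shape of "serve the instance list from the cache, populating it lazily if it is empty".

```python
# Hypothetical stand-in for a driver-side node cache; not the nova API.
class NodeCacheSketch:
    def __init__(self, fetch_nodes):
        # fetch_nodes: assumed callable returning only the nodes that this
        # nova-compute service is responsible for.
        self._fetch_nodes = fetch_nodes
        self._cache = None

    def refresh(self):
        """Rebuild the cache (e.g. when resource tracking updates run)."""
        self._cache = {node["uuid"]: node for node in self._fetch_nodes()}

    def list_instance_uuids(self):
        """List instances from the cache, populating it only if empty."""
        if self._cache is None:
            self.refresh()
        return [
            node["instance_uuid"]
            for node in self._cache.values()
            if node["instance_uuid"] is not None
        ]


fetch_calls = []


def fetch_nodes():
    # Records each call so the lazy-population behaviour is visible.
    fetch_calls.append(1)
    return [
        {"uuid": "node-1", "instance_uuid": "inst-a"},
        {"uuid": "node-2", "instance_uuid": None},  # no instance deployed
        {"uuid": "node-3", "instance_uuid": "inst-b"},
    ]


cache = NodeCacheSketch(fetch_nodes)
first = cache.list_instance_uuids()
second = cache.list_instance_uuids()
print(first, len(fetch_calls))
```

In this sketch the second call is answered from the cache without another fetch, so the power sync would not trigger a full node listing against Ironic.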