heal_instance_info_cache_interval help text is inaccurate

Bug #1996094 reported by sean mooney
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Low
Assigned to: sean mooney

Bug Description

the current help text for the config option is as follows

heal_instance_info_cache_interval

    Type: integer
    Default: 60

    Interval between instance network information cache updates.

    Number of seconds after which each compute node runs the task of querying Neutron for all of its instances networking information, then updates the Nova db with that information. Nova will never update it’s cache if this option is set to 0. If we don’t update the cache, the metadata service and nova-api endpoints will be proxying incorrect network data about the instance. So, it is not recommended to set this option to 0.

    Possible values:

        Any positive integer in seconds.

        Any value <=0 will disable the sync. This is not recommended.

this is not correct.

when the value is set to 0 it disables the periodic task, so nova will not try to heal the cache, but the cache will still get updated if neutron notifies us of a change.

every time neutron emits a network-changed event for an interface nova updates that interface in the info cache.
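
The distinction between the two update paths can be illustrated with a toy model. This is a minimal sketch, not nova's actual code: the class `ComputeSketch`, its methods, and the port data are all hypothetical names made up for illustration. It only shows that a non-positive interval turns off the periodic heal while event-driven updates keep working.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeSketch:
    """Toy model of a compute service; hypothetical, not nova's real classes."""
    heal_interval: int
    info_cache: dict = field(default_factory=dict)

    def periodic_heal_enabled(self) -> bool:
        # Per the help text, any value <= 0 disables the periodic task.
        return self.heal_interval > 0

    def run_periodic_heal(self, neutron_ports: dict) -> bool:
        """Refresh the cache from Neutron, but only if the task is enabled."""
        if not self.periodic_heal_enabled():
            return False
        self.info_cache = dict(neutron_ports)
        return True

    def on_network_changed(self, port_id: str, port_info: dict) -> None:
        """Event-driven refresh: runs regardless of the interval setting."""
        self.info_cache[port_id] = port_info


compute = ComputeSketch(heal_interval=0)
# The periodic path is a no-op when the interval is 0.
healed = compute.run_periodic_heal({"port-a": {"ip": "10.0.0.5"}})
# A network-changed notification still updates the cached interface.
compute.on_network_changed("port-a", {"ip": "10.0.0.9"})
print(healed, compute.info_cache)
```

With the interval at 0 the periodic call does nothing, yet the cache still ends up holding the data delivered by the event.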

the reason that setting the value to 0 is not recommended has nothing to do with what the current help text states. it's not recommended because it is not tested today.

in large deployments with many compute agents it may be better for scaling to disable this periodic task.

for environments that use ironic or a clustered hypervisor like vmware, the default will not introduce a performance issue, but it will take a very long time to heal all the nodes if there is an issue that the periodic task would fix.
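
A rough back-of-the-envelope sketch of why a full pass is slow when one compute service manages many instances. It assumes the periodic task heals a single instance per run, which this report does not state, so treat that as an assumption; the function name and the instance count are hypothetical.

```python
def full_pass_seconds(instance_count: int, interval: int) -> int:
    """Estimate the time for the periodic task to visit every instance once.

    Assumption (not stated in this report): one instance is healed per run.
    """
    if interval <= 0:
        raise ValueError("the periodic task is disabled for interval <= 0")
    return instance_count * interval


# Hypothetical example: 1000 instances behind one compute service,
# using the 60 second interval quoted in the help text above.
hours = full_pass_seconds(1000, 60) / 3600
print(f"{hours:.1f} hours for one full pass")
```

Under that assumption the duration grows linearly with the number of instances per compute service, which is why ironic and clustered hypervisors are the slow case.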

as a result it is likely better to disable this long term.

for now we should just fix the help text and add testing of disabling this in a ci job.

Tags: db network
Changed in nova:
assignee: nobody → sean mooney (sean-k-mooney)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/939476

Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/939476
Committed: https://opendev.org/openstack/nova/commit/b3f881572088b16f693bf52d932273429996ca60
Submitter: "Zuul (22348)"
Branch: master

commit b3f881572088b16f693bf52d932273429996ca60
Author: Sean Mooney <email address hidden>
Date: Thu Jan 16 18:02:57 2025 +0000

    Disable the heal instance info cache periodic task

    The _heal_instance_info_cache periodic task predates
    the introduction of the server external events API
    which is now the canonical way to refresh the cache.

    This change updates the default value of
    ``[compute]heal_instance_info_cache_interval``
    to -1 disabling it by default.

    The nova-ovs-hybrid-plug job is extended to test the
    legacy configuration value and the config override is removed
    from nova-next

    Closes-Bug: #1996094
    Related-Bug: #2089225
    Change-Id: I33ac91bb4f3ead51af2f7005002d5eb5078540d9

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 31.0.0.0rc1

This issue was fixed in the openstack/nova 31.0.0.0rc1 Epoxy release candidate.
