Seeing an issue with tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_migration_with_trunk failing intermittently. The live migration appears to complete successfully but the subport fails to become ACTIVE within the 60-second timeout:
2022-01-27 21:49:54,119 107445 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
Body: None
Response - Headers: {'date': 'Thu, 27 Jan 2022 21:49:54 GMT', 'server': 'Apache/2.4.41 (Ubuntu)', 'content-type': 'application/json', 'content-length': '782', 'x-openstack-request-id': 'req-79ecb361-fe12-4efe-9d7d-323378afabdf', 'connection': 'close', 'status': '200', 'content-location': 'https://10.176.196.23:9696/v2.0/ports/94892f6e-22d8-4707-bfd6-05ae6159c1e5'}
Body: b'{"port":{"id":"94892f6e-22d8-4707-bfd6-05ae6159c1e5","name":"tempest-subport-887366041","network_id":"d60604d8-e700-41d5-8d80-ae2f840e7a9e","tenant_id":"2fe75fa1bd044fb89156fa9b6245a9af","mac_address":"fa:16:3e:74:a4:4d","admin_state_up":true,"status":"DOWN","device_id":"","device_owner":"trunk:subport","fixed_ips":[{"subnet_id":"cd95ee55-6cb3-4401-be11-53207a7794c9","ip_address":"19.80.0.197"}],"allowed_address_pairs":[],"extra_dhcp_opts":[],"security_groups":["9eef9e28-4f49-4f80-9680-af8011de0c8b"],"description":"","binding:vnic_type":"normal","port_security_enabled":true,"qos_policy_id":null,"qos_network_policy_id":null,"tags":[],"created_at":"2022-01-27T21:47:56Z","updated_at":"2022-01-27T21:48:56Z","revision_number":6,"project_id":"2fe75fa1bd044fb89156fa9b6245a9af"}}'
2022-01-27 21:49:59,125 107445 DEBUG [tempest.lib.common.utils.test_utils] Call _is_port_status_active returns false in 60.000000 seconds
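Throughout the wait, the GET responses show the subport stuck in status DOWN with an empty device_id. For anyone reproducing this outside of tempest, the same status (plus the binding host, which matters in the log analysis further down) can be watched with a small openstacksdk loop. This is just a triage sketch, assuming openstacksdk is installed and a clouds.yaml admin entry named "devstack-admin" exists:

import time
import openstack

conn = openstack.connect(cloud="devstack-admin")  # assumed cloud name
port_id = "94892f6e-22d8-4707-bfd6-05ae6159c1e5"  # the subport from this run

# Poll every 5 seconds for ~60 seconds, mirroring the tempest wait below.
for _ in range(12):
    port = conn.network.get_port(port_id)
    print(port.status, port.binding_host_id)
    if port.status == "ACTIVE":
        break
    time.sleep(5)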
Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper
    return func(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 70, in wrapper
    return f(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/api/compute/admin/test_live_migration.py", line 281, in test_live_migration_with_trunk
    self.assertTrue(
  File "/usr/lib/python3.8/unittest/case.py", line 765, in assertTrue
    raise self.failureException(msg)
AssertionError: False is not true
This is where the tempest test is failing [1]:
# Wait till subport status is ACTIVE
self.assertTrue(
    test_utils.call_until_true(
        self._is_port_status_active, CONF.validation.connect_timeout,
        5, subport['id']))
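call_until_true polls the predicate every 5 seconds for up to CONF.validation.connect_timeout seconds (60 in this run) and returns False if the predicate never returns True, which is the "returns false in 60.000000 seconds" line above. The predicate itself amounts to fetching the port and checking its status; a paraphrase (see [1] for the actual helper):

def _is_port_status_active(self, port_id):
    # Paraphrased: look the port up via the Neutron ports client and
    # report whether it has reached ACTIVE yet.
    port = self.ports_client.show_port(port_id)['port']
    return port['status'] == 'ACTIVE'

So the failure simply means the subport was still DOWN after 60 seconds of polling.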
The instance in this example, d0ef464c-2c66-4619-8d5d-e0eb0568f545, is shown to have migrated successfully [2]:
Jan 27 21:48:58.568894 ubuntu-focal-rax-iad-0028206891 nova-compute[53318]: INFO nova.compute.manager [None req-0fb6b829-6b7e-460f-96f2-5774ba2fbc3e None None] [instance: d0ef464c-2c66-4619-8d5d-e0eb0568f545] Migrating instance to ubuntu-focal-rax-iad-0028206890 finished successfully.
Checking the neutron logs (q-svc and q-agt), I did not find any errors for port 94892f6e-22d8-4707-bfd6-05ae6159c1e5, but these messages in q-svc might be relevant [3]:
Jan 27 21:48:56.252261 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.rpc [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] Device 94892f6e-22d8-4707-bfd6-05ae6159c1e5 no longer exists at agent ovs-agent-ubuntu-focal-rax-iad-0028206891 {{(pid=90137) update_device_down /opt/stack/neutron/neutron/plugins/ml2/rpc.py:259}}
[...]
Jan 27 21:48:56.585883 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.plugin [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] Current status of the port 94892f6e-22d8-4707-bfd6-05ae6159c1e5 is: ACTIVE; New status is: DOWN {{(pid=90137) _update_individual_port_db_status /opt/stack/neutron/neutron/plugins/ml2/plugin.py:2225}}
[...]
Jan 27 21:48:56.675076 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.notifiers.nova [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] device_id is not set on port 94892f6e-22d8-4707-bfd6-05ae6159c1e5 yet. {{(pid=90137) _can_notify /opt/stack/neutron/neutron/notifiers/nova.py:194}}
[...]
Jan 27 21:48:56.763220 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.db [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] For port 94892f6e-22d8-4707-bfd6-05ae6159c1e5, host ubuntu-focal-rax-iad-0028206891, got binding levels [PortBindingLevel(driver='openvswitch',host='ubuntu-focal-rax-iad-0028206891',level=0,port_id=94892f6e-22d8-4707-bfd6-05ae6159c1e5,segment=NetworkSegment(a64eebe3-ed10-423e-89ad-949a28f525d6),segment_id=a64eebe3-ed10-423e-89ad-949a28f525d6)] {{(pid=90137) get_binding_level_objs /opt/stack/neutron/neutron/plugins/ml2/db.py:74}}
[...]
Jan 27 21:48:56.947538 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.api.rpc.handlers.resources_rpc [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] Pushing event updated for resources: {'Port': ['ID=94892f6e-22d8-4707-bfd6-05ae6159c1e5,revision_number=6']} {{(pid=90137) push /opt/stack/neutron/neutron/api/rpc/handlers/resources_rpc.py:237}}
[...]
Jan 27 21:48:58.985314 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.rpc [None req-eeef9c46-be6a-44e1-b19d-d45aae4f830d None None] Device 94892f6e-22d8-4707-bfd6-05ae6159c1e5 up at agent ovs-agent-ubuntu-focal-rax-iad-0028206890 {{(pid=90137) update_device_up /opt/stack/neutron/neutron/plugins/ml2/rpc.py:296}}
Jan 27 21:48:59.121212 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.plugin [None req-eeef9c46-be6a-44e1-b19d-d45aae4f830d None None] The host ubuntu-focal-rax-iad-0028206890 is not matching for port 94892f6e-22d8-4707-bfd6-05ae6159c1e5 host ubuntu-focal-rax-iad-0028206891! {{(pid=90137) port_bound_to_host /opt/stack/neutron/neutron/plugins/ml2/plugin.py:2323}}
Jan 27 21:48:59.121805 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.rpc [None req-eeef9c46-be6a-44e1-b19d-d45aae4f830d None None] Device 94892f6e-22d8-4707-bfd6-05ae6159c1e5 not bound to the agent host ubuntu-focal-rax-iad-0028206890 {{(pid=90137) update_device_up /opt/stack/neutron/neutron/plugins/ml2/rpc.py:304}}
The last three log messages above then repeat once more, and after that all I see is the periodic polling of port 94892f6e-22d8-4707-bfd6-05ae6159c1e5, presumably the tempest test waiting for it to go ACTIVE.
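To spell out what those last messages imply: when the destination agent reports the device up, the server only transitions the port to ACTIVE if the port binding host matches the agent that sent the RPC. In this run the binding still points at the source host while the destination agent is the one reporting, so the "device up" is ignored and the port stays DOWN. A simplified, self-contained paraphrase of that guard (not the actual ml2 code, just the logic the log lines describe):

def should_set_active(binding_host, agent_host):
    # "The host <agent_host> is not matching for port ... host <binding_host>!"
    if binding_host != agent_host:
        # "Device ... not bound to the agent host ..." -> status left as-is
        return False
    return True

# In this failure the binding was never updated to the destination host:
assert should_set_active(
    binding_host="ubuntu-focal-rax-iad-0028206891",  # source, still bound
    agent_host="ubuntu-focal-rax-iad-0028206890",    # destination agent
) is False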
[1] https://github.com/openstack/tempest/blob/8c4b99e26be51272692f51b00a534ee15dcf74d1/tempest/api/compute/admin/test_live_migration.py#L281
[2] https://zuul.opendev.org/t/openstack/build/c8f780748b10417a90dc74db65f8dd3d/log/compute1/logs/screen-n-cpu.txt#10973
[3] https://zuul.opendev.org/t/openstack/build/c8f780748b10417a90dc74db65f8dd3d/log/controller/logs/screen-q-svc.txt#11519
I think this is a duplicate of https://bugs.launchpad.net/neutron/+bug/1940425, but if I mark this as a duplicate then we lose it from the nova gate-failure query. :/