DPDK instances are failing to start: Failed to bind socket to /run/libvirt-vhost-user/vhu3ba44fdc-7c: No such file or directory
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Nova Compute Charm | Invalid | Undecided | Unassigned |
charm-layer-ovn | Fix Released | High | Liam Young |
charm-ovn-chassis | Fix Released | High | Unassigned |
Bug Description
== Env
focal/ussuri + ovn, latest stable charms
juju status: https:/
Hardware: Huawei CH121 V5 with MZ532,4*25GE Mezzanine Card,PCIE 3.0 X16 NICs + manually installed PMD for DPDK enablement (librte-
== Problem description
A DPDK instance cannot be launched after a fresh deployment (focal/ussuri + OVN, latest stable charms); it fails with the error below:
$ os server show dpdk-test-instance -f yaml
OS-DCF:diskConfig: MANUAL
OS-EXT-
OS-EXT-
OS-EXT-
OS-EXT-
OS-EXT-
OS-EXT-
OS-EXT-
OS-SRV-
OS-SRV-
accessIPv4: ''
accessIPv6: ''
addresses: ''
config_drive: 'True'
created: '2021-09-
fault:
code: 500
created: '2021-09-
details: "Traceback (most recent call last):\n File \"/usr/
, line 651, in build_instances\n scheduler_
/usr/
\ raise exception.
\ Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance\
\ 1bb2d1b7-
\ exited while connecting to monitor: 2021-09-
\ -chardev socket,
\ Failed to bind socket to /run/libvirt-
\ or directory\n"
message: 'Exceeded maximum number of retries. Exceeded max scheduling attempts 3
for instance 1bb2d1b7-
process exited while connecting to monitor: 2021-09-
-chardev '
flavor: m1.medium.
hostId: ''
id: 1bb2d1b7-
image: auto-sync/
key_name: ubuntu-keypair
name: dpdk-test-instance
project_id: cdade870811447a
properties: ''
status: ERROR
updated: '2021-09-
user_id: 13a0e7862c6641e
volumes_attached: ''
For the record, "generic" instances (e.g. non-DPDK/non-SRIOV) are scheduled and started without any issues.
== Steps to reproduce
openstack network create --external --provider-
openstack subnet create --allocation-pool start=<
openstack aggregate create --zone nova dpdk
openstack aggregate set --property dpdk=true dpdk
openstack aggregate add host dpdk <fqdn>
openstack aggregate show dpdk --max-width=80
openstack flavor set --property aggregate_
openstack server create --config-drive true --network ext_net_dpdk --key-name ubuntu-keypair --image focal --flavor m1.medium.dpdk dpdk-test-instance
== Analysis
[before redeployment] nova-compute log : https:/
[fresh deployment] juju crashdump: https:/
<on hypervisor>
# ovs-vsctl get open_vswitch . other_config
{dpdk-extra=
# cat /etc/tmpfiles.
# Create libvirt writeable directory for vhost-user sockets
d /run/libvirt-
In fact, none of the compute hosts have that file: https:/
After running the command below, the missing /run/... directory appeared and the VM could be scheduled and started. However, although it started, it was not reachable over the network.
# systemd-tmpfiles --create
# stat /run/libvirt-
File: /run/libvirt-
Size: 40 Blocks: 0 IO Block: 4096 directory
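The manual check above can be sketched as a small script for use on a hypervisor. This is only an illustration, not part of any charm: the helper name check_vhost_dir and the VHOST_DIR variable are hypothetical, and the default directory is taken from the error message in the bug title. The script only reports the state and does not change anything on the host.

```shell
#!/bin/sh
# Sketch: report whether the vhost-user socket directory exists on this host.
# The default path comes from the error message in this report; override it
# by exporting VHOST_DIR (a hypothetical variable used only in this sketch).
VHOST_DIR="${VHOST_DIR:-/run/libvirt-vhost-user}"

# check_vhost_dir is a hypothetical helper: prints "present" if the given
# path is a directory, "missing" otherwise.
check_vhost_dir() {
    if [ -d "$1" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

state=$(check_vhost_dir "$VHOST_DIR")
echo "$VHOST_DIR: $state"

if [ "$state" = "missing" ]; then
    # The workaround from this report; it must be run as root and is
    # deliberately not executed by this script.
    echo "hint: run 'systemd-tmpfiles --create' to apply the tmpfiles.d entries"
fi
```

Note that, as described above, recreating the directory was enough to get the instance started but not to make it reachable over the network, so this check covers only the first symptom.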
description: | updated |
Changed in neutron (Ubuntu): | |
status: | New → Invalid |
no longer affects: | neutron |
no longer affects: | neutron (Ubuntu) |
Changed in charm-layer-ovn: | |
status: | New → Confirmed |
importance: | Undecided → High |
assignee: | nobody → Liam Young (gnuoy) |
Changed in charm-layer-ovn: | |
milestone: | none → 21.10 |
Changed in charm-layer-ovn: | |
status: | Fix Committed → Fix Released |
Changed in charm-ovn-chassis: | |
status: | Fix Committed → Fix Released |
+ field-critical, as one of the core cloud functionalities is affected and there's no known workaround yet.