Comment 0 for bug 2039768

Revision history for this message
Marcin Wilk (wilkmarcin) wrote :

This is actually a known upstream bug [1], but it affects OpenStack deployments that use the OVS L2 agent, so I am also opening this LP to provide more context and findings.
A massive stream of kernel routing table updates (rtnetlink notifications) causes the ovs-vswitchd process to burn CPU. As a result, ovs-vswitchd cannot keep up with the updates, which leads to OpenStack VM connectivity issues. I have seen this problem on a customer's production compute nodes (Focal/Ussuri), where it affected VM connectivity.
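
A quick way to see the correlation live (just a rough sketch of mine, assuming the usual iproute2/procps tools on a stock Ubuntu install) is to watch the rtnetlink route notifications in one terminal and ovs-vswitchd's CPU usage in another:
# Terminal 1: stream of rtnetlink route notifications emitted by the kernel
# (each added/deleted route produces one line; ovs-vswitchd listens to the same events)
ip monitor route
# Terminal 2: CPU usage of ovs-vswitchd while the routes are being churned
top -b -d 5 -p "$(pidof ovs-vswitchd)"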

The /var/log/openvswitch/ovs-vswitchd.log contains the following messages indicating the problem:
2023-10-18T14:27:46.912Z|04626|poll_loop|INFO|Dropped 4 log messages in last 11 seconds (most recently, 11 seconds ago) due to excessive rate
2023-10-18T14:27:46.912Z|04627|poll_loop|INFO|wakeup due to [POLLIN] on fd 43 (127.0.0.1:49868<->127.0.0.1:6633) at ../lib/stream-fd.c:157 (97% CPU usage)
2023-10-18T14:27:46.912Z|04628|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (97% CPU usage)
2023-10-18T14:27:46.913Z|04629|rconn|INFO|br-int<->tcp:127.0.0.1:6633: connecting...
2023-10-18T14:27:46.914Z|04630|rconn|INFO|br-data<->tcp:127.0.0.1:6633: connecting...
2023-10-18T14:27:47.986Z|02093|ovs_rcu(urcu4)|WARN|blocked 1001 ms waiting for main to quiesce
2023-10-18T14:27:48.985Z|02094|ovs_rcu(urcu4)|WARN|blocked 2000 ms waiting for main to quiesce
2023-10-18T14:27:50.986Z|02095|ovs_rcu(urcu4)|WARN|blocked 4001 ms waiting for main to quiesce
2023-10-18T14:27:54.985Z|02096|ovs_rcu(urcu4)|WARN|blocked 8000 ms waiting for main to quiesce
2023-10-18T14:27:57.817Z|04631|timeval|WARN|Unreasonably long 10905ms poll interval (3817ms user, 6811ms system)
2023-10-18T14:27:57.817Z|04632|timeval|WARN|context switches: 35 voluntary, 2722 involuntary
2023-10-18T14:27:57.817Z|04633|poll_loop|INFO|Dropped 2 log messages in last 11 seconds (most recently, 11 seconds ago) due to excessive rate
2023-10-18T14:27:57.817Z|04634|poll_loop|INFO|wakeup due to [POLLIN] on fd 43 (127.0.0.1:49868<->127.0.0.1:6633) at ../lib/stream-fd.c:157 (97% CPU usage)
2023-10-18T14:27:57.817Z|04635|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (97% CPU usage)
2023-10-18T14:27:57.818Z|04636|rconn|INFO|br-int<->tcp:127.0.0.1:6633: connected
2023-10-18T14:27:57.818Z|04637|rconn|INFO|br-data<->tcp:127.0.0.1:6633: connected
2023-10-18T14:27:58.910Z|02097|ovs_rcu(urcu4)|WARN|blocked 1000 ms waiting for main to quiesce
2023-10-18T14:27:59.910Z|02098|ovs_rcu(urcu4)|WARN|blocked 2000 ms waiting for main to quiesce
2023-10-18T14:28:01.910Z|02099|ovs_rcu(urcu4)|WARN|blocked 4000 ms waiting for main to quiesce
2023-10-18T14:28:05.910Z|02100|ovs_rcu(urcu4)|WARN|blocked 8000 ms waiting for main to quiesce
2023-10-18T14:28:08.830Z|04638|timeval|WARN|Unreasonably long 11012ms poll interval (3972ms user, 6725ms system)
2023-10-18T14:28:08.830Z|04639|timeval|WARN|context switches: 32 voluntary, 2304 involuntary
2023-10-18T14:28:08.830Z|04640|poll_loop|INFO|Dropped 1 log messages in last 11 seconds (most recently, 11 seconds ago) due to excessive rate
2023-10-18T14:28:08.830Z|04641|poll_loop|INFO|wakeup due to [POLLIN] on fd 43 (127.0.0.1:49868<->127.0.0.1:6633) at ../lib/stream-fd.c:157 (97% CPU usage)

The ovs mailing list discussion thread [2] explains the reasons for and the mechanics of this behaviour. Even though [1] and [2] mention a BGP full table update as the trigger for the faulty situation, I was able to reproduce the issue without BGP at all, simply by updating the kernel routing table in a loop.

The simplest reproducer on pure Ubuntu 20.04.6 LTS, kernel 5.4.0-162-generic, openvswitch-switch 2.13.8-0ubuntu1.2, 1 vCPU (a faster, batched variant of step 2 is sketched after these steps):
1. sudo apt install openvswitch-switch
2. for i in {10..25}; do for k in {0..20}; do for j in {1..254}; do sudo ip route add 10.1$i.$k.$j/32 via <an IP on local network>; done; done; done
3. wait a few minutes and keep watching /var/log/openvswitch/ovs-vswitchd.log
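
The per-route "sudo ip route add" loop in step 2 forks one process per route and is therefore slow. A rough sketch of a faster, batched variant using ip -batch (the 10.110.0.0/16 range and the 192.0.2.1 gateway are placeholders I picked, substitute values that fit the local network):
# replace with an IP on the local network, as in step 2
GW=192.0.2.1
# generate a batch file with ~65k host routes (adjust the ranges as needed)
for k in {0..255}; do
  for j in {1..254}; do
    echo "route add 10.110.$k.$j/32 via $GW"
  done
done > /tmp/routes.batch
# feed all commands to a single ip process; this produces the same flood of
# rtnetlink notifications with far less shell overhead
sudo ip -batch /tmp/routes.batch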

And the reproducer steps on an OpenStack nova compute host (Focal/Ussuri), Ubuntu 20.04.4 LTS, kernel 5.4.0-164-generic, openvswitch-switch 2.13.8-0ubuntu1.2, 4 vCPUs:
1. insert thousands of routing entries on the compute host (this ran for 90 minutes before I cancelled it):
for i in {10..25}; do for k in {0..255}; do for j in {1..254}; do sudo ip route add 10.1$i.$k.$j/32 via <an IP from local net>; done; done; done
2. in a second session, observe that ovs-vswitchd is consuming a lot of CPU cycles, and watch /var/log/openvswitch/ovs-vswitchd.log for the messages quoted above (a monitoring/cleanup sketch follows the nova-compute.log excerpt below).
3. sometimes an attempt to schedule a new VM on this compute node fails because the VM's port cannot be added to the OVS bridge:
os server create --image cirros --boot-from-volume 3 --flavor m1.tiny --key-name testkey --network private --host juju-5ef7f4-octavia-20.cloud.sts --os-compute-api-version 2.74 cirr111

nova-compute.log contains:
: libvirt.libvirtError: internal error: Unable to add port tap32029b67-2a to OVS bridge br-int
2023-10-18 14:28:31.840 2612 ERROR nova.virt.libvirt.driver [req-7a6b767b-c7e0-4671-a1b9-48c0b68d397e 93eeeebc2920450faaaf2395443505fd d6761dacbb0649189a07a4a1a191a8c0 - 6a4439622e71431c8b96073e33c3b7e1 6a4439622e71431c8b96073e33c3b7e1] [instance: 8bbdc758-83c8-43d6-a3ab-d620c21c53a4] Failed to start libvirt guest: libvirt.libvirtError: internal error: Unable to add port tap32029b67-2a to OVS bridge br-int
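
While step 1 is running, the impact can be quantified, and the injected routes removed afterwards, with something like the following (a sketch; pidstat comes from the sysstat package and may need to be installed, and the delete loop simply mirrors the add loop from step 1, so it generates the same kind of rtnetlink churn while it runs):
# how many routes have been injected so far
ip route | wc -l
# per-interval CPU usage of ovs-vswitchd (pidstat is in the "sysstat" package)
pidstat -u -p "$(pidof ovs-vswitchd)" 5
# cleanup: remove the injected /32 host routes again (mirror of the add loop)
for i in {10..25}; do for k in {0..255}; do for j in {1..254}; do sudo ip route del 10.1$i.$k.$j/32; done; done; done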

[1] https://github.com/openvswitch/ovs-issues/issues/185
[2] https://mail.openvswitch.org/pipermail/ovs-discuss/2022-October/052092.html