Hello,
I am seeing strange behaviour on my CentOS 8/OpenStack Ussuri test cluster. Note that I am using Open vSwitch in DVR mode with the Open vSwitch firewall driver and without router HA configured.
I have two VMs on a compute node in the same L2 segment, one with a floating IP and one without. Incoming connections to the floating IP work as expected until the other VM sends any traffic to the Internet. As soon as the second VM sends traffic, the incoming connection to the floating IP stops working, and new connections cannot be established either.
After stopping the incoming traffic to the floating IP and the outgoing traffic from the second VM, and then waiting 30-60 s, new incoming connections to the floating IP can be established again.
Traffic between the private IPs of both VMs works flawlessly and has no impact on incoming connections to the floating IP.
With explicitly_egress_direct set to True, the incoming traffic is forwarded to the network node, and I can capture the traffic on the vxlan_sys_4789 interface on both nodes (the compute and the network node).
If explicitly_egress_direct is not set in the configuration, traffic is
broadcast on the br-int of the compute node and is also forwarded to the
network node.
The traffic reaches the VM with the floating IP which sends return traffic, so the already established connection is working, but I cannot establish new connections.
Packet captures on vxlan_sys_4789 show the traffic both on the compute and
network node.
If I use no firewall driver and explicitly_egress_direct is not set in the
configuration, the incoming traffic is also broadcast on br-int. The
established connection is working, and I can establish new connections, but all the incoming traffic is broadcast.
The packet capture shows that the destination MAC of the incoming traffic is the correct MAC of the VM.
The established connection is listed in the conntrack table, but the new connection attempts do not show up.
What else can I do to isolate the problem?
Best Regards
Phil
ovs_version: "2.12.0"
Kernel 4.18.0-147.8.1.el8_1.x86_64
Flow dumps on compute node
==========================================
VXLAN underlay Net: 10.0.2.0/24
Provider Net: 192.168.97.0/24
Internal Net: 10.10.10.0/24
Floating IP: 192.168.97.161
VM with floating IP: 10.10.10.185 (fa:16:3e:e7:c3:cb)
VM without floating IP: 10.10.10.242 (fa:16:3e:f9:c2:b7)
ovs-appctl dpctl/show
system@ovs-system:
lookups: hit:317 missed:117 lost:0
flows: 4
masks: hit:1176 total:3 hit/pkt:2.71
port 0: ovs-system (internal)
port 1: br-ex (internal)
port 2: enp3s0 <<===== External Interface
port 3: br-int (internal)
port 4: br-tun (internal)
port 5: qr-1f6bbe11-9b (internal)
port 6: fg-0c191c26-85 (internal)
port 7: vxlan_sys_4789 (vxlan: packet_type=ptap)
port 8: tapf494600d-62 <<==== VM with floating IP
port 9: tapbd3a7589-3f <<==== VM without floating IP
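The datapath flows further down refer to ports only by number (for example actions:8), so a small lookup table helps when reading them. The following is a minimal Python sketch, not part of any OVS tooling, that builds such a table from the port lines of the `ovs-appctl dpctl/show` output above. The names `PORT_RE` and `parse_ports` are made up for this illustration.

```python
import re

# Port lines copied from the `ovs-appctl dpctl/show` output above,
# including the hand-written annotations.
DPCTL_SHOW = """\
system@ovs-system:
port 0: ovs-system (internal)
port 1: br-ex (internal)
port 2: enp3s0 <<===== External Interface
port 3: br-int (internal)
port 4: br-tun (internal)
port 5: qr-1f6bbe11-9b (internal)
port 6: fg-0c191c26-85 (internal)
port 7: vxlan_sys_4789 (vxlan: packet_type=ptap)
port 8: tapf494600d-62 <<==== VM with floating IP
port 9: tapbd3a7589-3f <<==== VM without floating IP
"""

# Matches "port <number>: <name>" and ignores whatever follows the name.
PORT_RE = re.compile(r"^\s*port (\d+): (\S+)")


def parse_ports(text):
    """Return a {port_number: port_name} dict from dpctl/show text."""
    ports = {}
    for line in text.splitlines():
        match = PORT_RE.match(line)
        if match:
            ports[int(match.group(1))] = match.group(2)
    return ports


PORTS = parse_ports(DPCTL_SHOW)

if __name__ == "__main__":
    for number, name in sorted(PORTS.items()):
        print(number, name)
```

With this table, actions:8 reads as "output to tapf494600d-62", the tap device of the VM with the floating IP.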
firewall NONE / explicitly_egress_direct TRUE
---------------------------------------------
WORKING
--------------------------------
ovs-appctl dpctl/dump-flows
recirc_id(0),in_port(5),eth(src=fa:16:3e:4f:ac:f8,dst=fa:16:3e:e7:c3:cb),eth_type(0x0800),ipv4(frag=no), packets:4, bytes:392, used:0.957s, actions:8
recirc_id(0),in_port(8),eth(src=fa:16:3e:e7:c3:cb,dst=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(frag=no), packets:4, bytes:392, used:0.957s, actions:5
recirc_id(0),in_port(2),eth(src=00:11:0a:66:b2:68,dst=fa:16:3e:5a:f0:65),eth_type(0x8100),vlan(vid=97,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:4, bytes:408, used:0.957s, actions:pop_vlan,push_vlan(vid=1,pcp=0),3,pop_vlan,6
recirc_id(0),in_port(6),eth(src=fa:16:3e:5a:f0:65,dst=00:11:0a:66:b2:68),eth_type(0x0800),ipv4(frag=no), packets:4, bytes:392, used:0.957s, actions:push_vlan(vid=97,pcp=0),2
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=01:00:0c:cc:cc:cc),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=00:17:e0:1f:63:94),eth_type(0x9000), packets:66, bytes:3960, used:0.305s, actions:drop
NOT WORKING - after ping from VM
--------------------------------
ovs-appctl dpctl/dump-flows
recirc_id(0),in_port(5),eth(src=fa:16:3e:4f:ac:f8,dst=fa:16:3e:e7:c3:cb),eth_type(0x0800),ipv4(frag=no), packets:127, bytes:12446, used:5.742s, actions:8
recirc_id(0),in_port(8),eth(src=fa:16:3e:e7:c3:cb,dst=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(frag=no), packets:127, bytes:12446, used:5.741s, actions:5
recirc_id(0),in_port(5),skb_mark(0x4000000),eth(src=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:5, bytes:490, used:0.734s, actions:set(tunnel(tun_id=0x1,src=10.0.2.100,dst=10.0.2.20,ttl=64,tp_dst=4789,flags(df|key))),set(eth(src=fa:16:3f:9c:aa:5e)),set(skb_mark(0)),7
recirc_id(0),tunnel(tun_id=0x1,src=10.0.2.20,dst=10.0.2.100,flags(-df-csum+key)),in_port(7),eth(src=fa:16:3e:82:99:59,dst=fa:16:3e:f9:c2:b7),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:9
recirc_id(0),in_port(2),eth(src=00:11:0a:66:b2:68,dst=fa:16:3e:5a:f0:65),eth_type(0x8100),vlan(vid=97,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:132, bytes:13464, used:0.734s, actions:pop_vlan,push_vlan(vid=1,pcp=0),3,pop_vlan,6
recirc_id(0),in_port(6),eth(src=fa:16:3e:5a:f0:65,dst=00:11:0a:66:b2:68),eth_type(0x0800),ipv4(frag=no), packets:127, bytes:12446, used:5.741s, actions:push_vlan(vid=97,pcp=0),2
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=01:00:0c:cc:cc:cc),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=00:17:e0:1f:63:94),eth_type(0x9000), packets:8, bytes:480, used:8.269s, actions:drop
recirc_id(0),in_port(9),eth(src=fa:16:3e:f9:c2:b7,dst=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:5
firewall NONE / explicitly_egress_direct False
---------------------------------------------
WORKING
--------------------------------
ovs-appctl dpctl/dump-flows
recirc_id(0),in_port(8),eth(src=fa:16:3e:e7:c3:cb,dst=fa:16:3e:4f:ac:f8),eth_type(0x0806),arp(sip=10.10.10.185), packets:0, bytes:0, used:never, actions:5
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=00:17:e0:1f:63:94),eth_type(0x9000), packets:38, bytes:2280, used:5.542s, actions:drop
recirc_id(0),in_port(2),eth(src=00:11:0a:66:b2:68,dst=fa:16:3e:5a:f0:65),eth_type(0x8100),vlan(vid=97,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,6
recirc_id(0),in_port(5),eth(src=fa:16:3e:4f:ac:f8,dst=fa:16:3e:e7:c3:cb),eth_type(0x0806), packets:0, bytes:0, used:never, actions:8
recirc_id(0),in_port(2),eth(src=00:11:0a:66:b2:68,dst=fa:16:3e:5a:f0:65),eth_type(0x8100),vlan(vid=97,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:42, bytes:4284, used:0.665s, actions:pop_vlan,6
recirc_id(0),in_port(5),eth(src=fa:16:3e:4f:ac:f8,dst=fa:16:3e:e7:c3:cb),eth_type(0x0800),ipv4(frag=no), packets:42, bytes:4116, used:0.664s, actions:8
recirc_id(0),in_port(2),eth(src=fa:16:3e:86:13:c3,dst=33:33:00:00:00:02),eth_type(0x8100),vlan(vid=97,pcp=0),encap(eth_type(0x86dd),ipv6(frag=no)), packets:0, bytes:0, used:never, actions:1,pop_vlan,push_vlan(vid=1,pcp=0),3,pop_vlan,6
recirc_id(0),in_port(6),eth(src=fa:16:3e:5a:f0:65,dst=00:11:0a:66:b2:68),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=97,pcp=0),2
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=01:00:0c:cc:cc:cc),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(8),eth(src=fa:16:3e:e7:c3:cb,dst=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(frag=no), packets:42, bytes:4116, used:0.664s, actions:5
recirc_id(0),in_port(6),eth(src=fa:16:3e:5a:f0:65,dst=00:11:0a:66:b2:68),eth_type(0x0800),ipv4(frag=no), packets:42, bytes:4116, used:0.664s, actions:push_vlan(vid=97,pcp=0),2
NOT WORKING - after ping from VM
--------------------------------
ovs-appctl dpctl/dump-flows
recirc_id(0),in_port(8),eth(src=fa:16:3e:e7:c3:cb,dst=fa:16:3e:4f:ac:f8),eth_type(0x0806),arp(sip=10.10.10.185), packets:0, bytes:0, used:never, actions:5
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=00:17:e0:1f:63:94),eth_type(0x9000), packets:44, bytes:2640, used:1.590s, actions:drop
recirc_id(0),in_port(2),eth(src=00:11:0a:66:b2:68,dst=fa:16:3e:5a:f0:65),eth_type(0x8100),vlan(vid=97,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,6
recirc_id(0),in_port(9),eth(src=fa:16:3e:f9:c2:b7,dst=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:5
recirc_id(0),in_port(5),eth(src=fa:16:3e:4f:ac:f8,dst=fa:16:3e:e7:c3:cb),eth_type(0x0806), packets:0, bytes:0, used:never, actions:8
recirc_id(0),tunnel(tun_id=0x1,src=10.0.2.20,dst=10.0.2.100,flags(-df-csum+key)),in_port(7),eth(src=fa:16:3e:82:99:59,dst=fa:16:3e:f9:c2:b7),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:9
recirc_id(0),in_port(2),eth(src=00:11:0a:66:b2:68,dst=fa:16:3e:5a:f0:65),eth_type(0x8100),vlan(vid=97,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:92, bytes:9384, used:0.181s, actions:pop_vlan,6
recirc_id(0),in_port(9),skb_mark(0),eth(src=fa:16:3e:f9:c2:b7,dst=33:33:00:00:00:02),eth_type(0x86dd),ipv6(proto=58,tclass=0/0x3,frag=no),icmpv6(type=128/0xf8), packets:0, bytes:0, used:never, actions:push_vlan(vid=2,pcp=0),3,set(tunnel(tun_id=0x1,src=10.0.2.100,dst=10.0.2.20,ttl=64,tp_dst=4789,flags(df|key))),pop_vlan,7,set(tunnel(tun_id=0x1,src=10.0.2.100,dst=10.0.2.102,ttl=64,tp_dst=4789,flags(df|key))),7,set(tunnel(tun_id=0x1,src=10.0.2.100,dst=10.0.2.101,ttl=64,tp_dst=4789,flags(df|key))),7,set(tunnel(tun_id=0x1,src=10.0.2.100,dst=10.0.2.103,ttl=64,tp_dst=4789,flags(df|key))),7,5,8
recirc_id(0),in_port(9),eth(src=fa:16:3e:f9:c2:b7,dst=fa:16:3e:4f:ac:f8),eth_type(0x0806),arp(sip=10.10.10.242), packets:0, bytes:0, used:never, actions:5
recirc_id(0),in_port(5),skb_mark(0x4000000),eth(src=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:30, bytes:2940, used:0.181s, actions:push_vlan(vid=2,pcp=0),3,set(tunnel(tun_id=0x1,src=10.0.2.100,dst=10.0.2.20,ttl=64,tp_dst=4789,flags(df|key))),set(eth(src=fa:16:3f:9c:aa:5e)),pop_vlan,set(skb_mark(0)),7,set(eth(src=fa:16:3e:4f:ac:f8)),set(skb_mark(0x4000000)),8,9
recirc_id(0),in_port(6),eth(src=fa:16:3e:5a:f0:65,dst=00:11:0a:66:b2:68),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=97,pcp=0),2
recirc_id(0),in_port(2),eth(src=00:17:e0:1f:63:94,dst=01:00:0c:cc:cc:cc),eth_type(0/0xffff), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(8),eth(src=fa:16:3e:e7:c3:cb,dst=fa:16:3e:4f:ac:f8),eth_type(0x0800),ipv4(frag=no), packets:13, bytes:1274, used:0.180s, actions:5
recirc_id(0),in_port(6),eth(src=fa:16:3e:5a:f0:65,dst=00:11:0a:66:b2:68),eth_type(0x0800),ipv4(frag=no), packets:13, bytes:1274, used:0.180s, actions:push_vlan(vid=97,pcp=0),2
recirc_id(0),in_port(5),eth(src=fa:16:3e:4f:ac:f8,dst=fa:16:3e:f9:c2:b7),eth_type(0x0806), packets:0, bytes:0, used:never, actions:9
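To make the flooding easier to spot, the action list of each datapath flow can be reduced to the ports it outputs to. The sketch below is a rough Python helper written for this post, not an OVS API. It splits an action list at top-level commas and keeps the bare numbers, which are the output ports. The two action strings are copied from the in_port(5),skb_mark(0x4000000) flow in the "NOT WORKING" dumps above, and the port names come from the dpctl/show output. `output_ports` and `PORT_NAMES` are hypothetical names.

```python
# Port numbers and names as listed in the dpctl/show output above.
PORT_NAMES = {3: "br-int", 7: "vxlan_sys_4789",
              8: "tapf494600d-62", 9: "tapbd3a7589-3f"}

# Action lists of the in_port(5),skb_mark(0x4000000) flow, copied verbatim.
EGRESS_DIRECT_TRUE = (
    "actions:set(tunnel(tun_id=0x1,src=10.0.2.100,dst=10.0.2.20,ttl=64,"
    "tp_dst=4789,flags(df|key))),set(eth(src=fa:16:3f:9c:aa:5e)),"
    "set(skb_mark(0)),7"
)
EGRESS_DIRECT_FALSE = (
    "actions:push_vlan(vid=2,pcp=0),3,set(tunnel(tun_id=0x1,src=10.0.2.100,"
    "dst=10.0.2.20,ttl=64,tp_dst=4789,flags(df|key))),"
    "set(eth(src=fa:16:3f:9c:aa:5e)),pop_vlan,set(skb_mark(0)),7,"
    "set(eth(src=fa:16:3e:4f:ac:f8)),set(skb_mark(0x4000000)),8,9"
)


def output_ports(flow_line):
    """Return the datapath ports a flow outputs to, in order."""
    if "actions:" not in flow_line:
        return []
    actions = flow_line.rpartition("actions:")[2]
    ports, depth, token = [], 0, ""
    # The trailing comma flushes the last token.
    for char in actions + ",":
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == "," and depth == 0:
            # A bare number at the top level is an output action.
            if token.strip().isdigit():
                ports.append(int(token))
            token = ""
        else:
            token += char
    return ports


if __name__ == "__main__":
    for label, actions in (("True", EGRESS_DIRECT_TRUE),
                           ("False", EGRESS_DIRECT_FALSE)):
        names = [PORT_NAMES.get(p, str(p)) for p in output_ports(actions)]
        print("explicitly_egress_direct", label, "->", names)
```

In these two dumps the same flow is sent only to the VXLAN port when explicitly_egress_direct is True, while with the option unset it is additionally copied to br-int and to both VM tap devices.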
Neutron config
==========================================
Compute1
-----------------------------------------
neutron.conf
-----------------------
[DEFAULT]
transport_url = rabbit://openstack:*********@controller
auth_strategy = keystone
core_plugin = ml2
service_plugins = router
allow_overlapping_ips = true
notify_nova_on_port_status_changes = true
notify_nova_on_port_data_changes = true
global_physnet_mtu = 9000
max_l3_agents_per_router = 0
min_l3_agents_per_router = 1
[database]
connection = mysql+pymysql://neutron:*********@controller/neutron
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = *********
[oslo_concurrency]
lock_path = /var/lib/neutron/tmp
l3_agent.ini
-----------------------
[DEFAULT]
interface_driver = openvswitch
router_delete_namespaces = True
agent_mode = dvr
external_network_bridge =
ml2_conf.ini
------------------------
[DEFAULT]
[l2pop]
[ml2]
type_drivers = flat,vlan,gre,vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch
segment_mtu = 1500
path_mtu = 9000
physical_network_mtus = provider:1500
extension_drivers = port_security
[ml2_type_flat]
flat_networks = provider
[ml2_type_geneve]
[ml2_type_gre]
[ml2_type_vlan]
network_vlan_ranges = provider
[ml2_type_vxlan]
vni_ranges = 1:1000
openvswitch_agent.ini
---------------------
[DEFAULT]
[agent]
tunnel_types = vxlan
veth_mtu = 9000
enable_distributed_routing = True
l2_population = True
arp_responder = True
[ovs]
local_ip = 10.0.2.100
bridge_mappings = provider:br-ex
integration_bridge = br-int
tunnel_bridge = br-tun
[securitygroup]
enable_security_group = True
enable_ipset = True
firewall_driver = openvswitch
Network
-----------------------------------------
neutron.conf
-----------------------
[DEFAULT]
core_plugin = ml2
service_plugins = router
allow_overlapping_ips = true
transport_url = rabbit://openstack:*********@controller
auth_strategy = keystone
notify_nova_on_port_status_changes = true
notify_nova_on_port_data_changes = true
global_physnet_mtu = 9000
max_l3_agents_per_router = 0
min_l3_agents_per_router = 1
[database]
connection = mysql+pymysql://neutron:*********@controller/neutron
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = **********
[oslo_concurrency]
lock_path = /var/lib/neutron/tmp
l3_agent.ini
-----------------------
[DEFAULT]
interface_driver = openvswitch
router_delete_namespaces = True
agent_mode = dvr_snat
external_network_bridge =
ml2_conf.ini
-----------------------
[DEFAULT]
[l2pop]
[ml2]
type_drivers = flat,vlan,gre,vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch
segment_mtu = 1500
path_mtu = 9000
physical_network_mtus = provider:1500
extension_drivers = port_security
[ml2_type_flat]
flat_networks = provider
[ml2_type_geneve]
[ml2_type_gre]
[ml2_type_vlan]
network_vlan_ranges = provider
[ml2_type_vxlan]
vni_ranges = 1:1000
openvswitch_agent.ini
----------------------
[DEFAULT]
[agent]
tunnel_types = vxlan
veth_mtu = 9000
enable_distributed_routing = True
l2_population = True
arp_responder = True
[network_log]
[ovs]
local_ip = 10.0.2.20
bridge_mappings = provider:br-ex
integration_bridge = br-int
tunnel_bridge = br-tun
[securitygroup]
enable_security_group = true
enable_ipset = true
firewall_driver = openvswitch
[xenapi]
Controller
-----------------------------------------
neutron.conf
-----------------------
[DEFAULT]
core_plugin = ml2
service_plugins = router
allow_overlapping_ips = true
transport_url = rabbit://openstack:***********@controller
auth_strategy = keystone
notify_nova_on_port_status_changes = true
notify_nova_on_port_data_changes = true
global_physnet_mtu = 9000
router_distributed = True
debug = true
[cors]
[database]
connection = mysql+pymysql://neutron:***********@controller/neutron
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = ***********
[oslo_concurrency]
lock_path = /var/lib/neutron/tmp
[oslo_messaging_amqp]
[oslo_messaging_kafka]
[oslo_messaging_notifications]
driver = messagingv2
[oslo_messaging_rabbit]
[oslo_middleware]
[oslo_policy]
policy_file = /etc/neutron/policy.yaml
policy_default_rule = default
[privsep]
[ssl]
[nova]
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = nova
password = ***********
l3_agent.ini
-----------------------
[DEFAULT]
interface_driver = openvswitch
router_delete_namespaces = True
external_network_bridge =
ml2_conf.ini
-----------------------
[DEFAULT]
[l2pop]
[ml2]
type_drivers = flat,vlan,gre,vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch,l2population
path_mtu = 9000
physical_network_mtus = provider:1500
extension_drivers = port_security
[ml2_type_flat]
flat_networks = provider
[ml2_type_vlan]
network_vlan_ranges = provider
[ml2_type_vxlan]
vni_ranges = 1:1000
openvswitch_agent.ini
---------------------
[DEFAULT]
[agent]
tunnel_types = vxlan
veth_mtu = 9000
enable_distributed_routing = True
l2_population = True
arp_responder = True
explicitly_egress_direct = True
[ovs]
local_ip = 10.0.2.10
bridge_mappings = provider:br-ex
integration_bridge = br-int
tunnel_bridge = br-tun
[securitygroup]
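Since explicitly_egress_direct appears only in the controller's openvswitch_agent.ini above and not in the compute or network node files, it may be worth checking each node's file mechanically. This is a minimal sketch using Python's standard configparser, under the assumption that the files are plain INI as pasted here. `egress_direct` is a name chosen for this example, and the two snippets are the [agent] sections copied from the compute and controller configs above.

```python
import configparser

# [agent] section from the compute node's openvswitch_agent.ini above.
COMPUTE_INI = """\
[DEFAULT]
[agent]
tunnel_types = vxlan
veth_mtu = 9000
enable_distributed_routing = True
l2_population = True
arp_responder = True
"""

# [agent] section from the controller's openvswitch_agent.ini above.
CONTROLLER_INI = """\
[DEFAULT]
[agent]
tunnel_types = vxlan
veth_mtu = 9000
enable_distributed_routing = True
l2_population = True
arp_responder = True
explicitly_egress_direct = True
"""


def egress_direct(ini_text):
    """Return the option as a bool, or None if it is not set at all."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(ini_text)
    if not cfg.has_option("agent", "explicitly_egress_direct"):
        return None
    return cfg.getboolean("agent", "explicitly_egress_direct")


if __name__ == "__main__":
    print("compute:", egress_direct(COMPUTE_INI))
    print("controller:", egress_direct(CONTROLLER_INI))
```

A result of None means the option is absent from that file, which corresponds to the "not set in the configuration" case described at the top of this post.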
Hello,
I have new information.
If I use dvr_no_external as agent_mode on the compute nodes, it works.
Incoming connections to the floating IP are routed via the network node; the outgoing traffic also works and does not interrupt incoming connections.
If explicitly_egress_direct is set to false, the return traffic from the VM is broadcast on br-int, which is the expected behaviour.
If explicitly_egress_direct is set to true, the return traffic is no longer broadcast on br-int and the incoming connections continue to work.
But as soon as I switch back to dvr as agent_mode, it breaks again.
Best Regards
Phil