[10.0 swarm] IPv6 functionality doesn't work

Bug #1675778 reported by Sergey Novikov
This bug affects 2 people
Affects: Fuel for OpenStack
Status: In Progress
Importance: High
Assigned to: Oleg Bondarev

Bug Description

Detailed bug description:

The issue was found by the CI jobs [0], [1], [2], [3].

Steps to reproduce:
        1. Deploy a cluster.
        2. Create two dual-stack networks, each with an IPv6 subnet
            (subnets should be in SLAAC mode,
            address spaces should not intersect).
        3. Create a virtual router and set its gateway.
        4. Attach these subnets to the router.
        5. Create a Security Group
            that allows SSH and ICMP for both IPv4 and IPv6.
        6. Launch two instances, one for each network.
        7. Lease a floating IP.
        8. Attach the floating IP to the main instance.
        9. SSH to the main instance and ping6 the other instance.
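
Step 2 requires that the two IPv6 address spaces do not intersect. A minimal sketch of how that precondition could be checked before creating the subnets, using only the Python standard library. The prefixes below are hypothetical documentation prefixes, not the ones used by the test:

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(cidrs):
    """Return every pair of same-version CIDRs whose address ranges intersect."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [
        (str(a), str(b))
        for a, b in combinations(nets, 2)
        if a.version == b.version and a.overlaps(b)
    ]


# Hypothetical prefixes for the two SLAAC subnets: disjoint, so nothing is reported.
ok = overlapping_pairs(["2001:db8:1::/64", "2001:db8:2::/64"])

# A /48 that contains the first /64 is reported as a conflict.
bad = overlapping_pairs(["2001:db8:1::/64", "2001:db8:1::/48"])

print(ok)
print(bad)
```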

Actual result: The booted instance is not reachable via its floating IP.
TimeoutError: Instance e2b1523a-892a-47f7-999e-47bff2066ccd is unreachable for 300 seconds

Additional information:
http://paste.openstack.org/show/604047/
http://paste.openstack.org/show/604048/

[0] - https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.thread_1/226/testReport/fuel_tests.tests.test_neutron_ipv6/TestNeutronIPv6/test_deploy_neutron_ip_v6
[1] - https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.ovs_firewall_with_dpdk/40/testReport/(root)/deploy_ovs_firewall_and_dpdk_vlan_ipv6/
[2] - https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.ovs_firewall/39/testReport/(root)/deploy_non_ha_cluster_with_ovs_firewall_ipv6_vxlan/
[3] - https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.ovs_firewall/39/testReport/(root)/deploy_non_ha_cluster_with_ovs_firewall_ipv6_vlan/

Sergey Novikov (snovikov) wrote :
Changed in fuel:
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
status: New → Confirmed
tags: added: area-library
Atsuko Ito (yottatsa) wrote :

Possible cause:
2017-03-24 01:51:17.610 28865 DEBUG neutron.agent.linux.dhcp [req-b8e5325a-451e-414d-bff3-bc34fabca323 - - - - -] Unable to access /var/lib/neutron/dhcp/560fd076-9a6b-4edc-85be-1083db94c931/interface _get_value_from_conf_file /usr/lib/python2.7/dist-packages/neutron/agent/linux/dhcp.py:261
2017-03-24 01:51:17.611 28865 DEBUG neutron.agent.linux.dhcp [req-b8e5325a-451e-414d-bff3-bc34fabca323 - - - - -] Agent does not have an interface on this network anymore, skipping reload: 560fd076-9a6b-4edc-85be-1083db94c931 reload_allocations /usr/lib/python2.7/dist-packages/neutron/agent/linux/dhcp.py:501

Atsuko Ito (yottatsa) wrote :

It initially failed with the following debug output: http://paste.openstack.org/show/604364/, which points to https://github.com/openstack/neutron/commit/571af6bc37315ff023981904bd2592aa8ec0ec14 by kevinbenton. Summoned.

Atsuko Ito (yottatsa) wrote :

It looks like no IPv6 address was created for the DHCP port of net1 (see the port-list in the snapshot or the row below), and then the previously mentioned code was triggered.

| 798cefef-c4c1-482e-bbc0-acea52e6490d | | fa:16:3e:6c:da:20 | {"subnet_id": "869f4abf-c440-4b65-a9be-074776fadaf1", "ip_address": "192.168.100.2"} |

Atsuko Ito (yottatsa) wrote :

Removing area-library, since the second VM is set up and fully running. Adding area-mos, since race conditions were mentioned around the spotted code.

tags: added: area-mos
removed: area-library
Atsuko Ito (yottatsa)
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → MOS Neutron (mos-neutron)
Changed in fuel:
assignee: MOS Neutron (mos-neutron) → Ann Taraday (akamyshnikova)
status: Confirmed → In Progress
Alexey Shtokolov (ashtokolov) wrote :

Raised to Critical because there is no workaround

Changed in fuel:
importance: High → Critical
Oleg Bondarev (obondarev) wrote :

So the difference between nets 1 and 2 is that for net 1 the dhcp port is first created with only the ipv4 subnet:

2017-03-30 05:12:33.660 29848 DEBUG neutron.api.rpc.handlers.dhcp_rpc [req-e099424a-10a2-48f3-ad99-67428a81cb3f - - - - -] Create dhcp port {u'port': {u'name': u'', u'admin_state_up': True, u'network_id': u'29d6752b-027a-4eb9-aa73-711eff1b58ca', u'tenant_id': u'0080fdf7aa774a11bb260f5b8129446c', u'fixed_ips': [{u'subnet_id': u'3f81f975-5718-4bdc-878c-614f22b1b783'}], u'device_id': u'dhcp5dbbf756-2249-5cb9-b9b4-259ca353b226-29d6752b-027a-4eb9-aa73-711eff1b58ca'}}

and then with the creation of ipv6 subnet the port is updated to have ipv6 ip:

2017-03-30 05:12:38.990 29848 DEBUG neutron.api.rpc.handlers.dhcp_rpc [req-bcd62396-0e9a-4f39-8bf7-e56f0588805c - - - - -] Update dhcp port {u'port': {u'network_id': u'29d6752b-027a-4eb9-aa73-711eff1b58ca', 'binding:host_id': u'node-2.test.domain.local', u'fixed_ips': [{u'subnet_id': u'3f81f975-5718-4bdc-878c-614f22b1b783', u'ip_address': u'192.168.100.2'}, {u'subnet_id': u'8363ac60-c30d-43dc-a1d1-3d39820602fd'}]}, 'id': u'2e3a5343-a995-498a-85d2-db686d119fab'}

for some reason no ipv6 address is allocated and the port is returned to the dhcp agent with still only one ipv4 ip: the agent considers this an error (which it actually is) and tries to resync.

for net2 dhcp port is created with ip allocation from both ipv4 and ipv6 subnets:

2017-03-30 05:12:42.999 29848 DEBUG neutron.api.rpc.handlers.dhcp_rpc [req-fa681e8c-6831-4d7f-aec0-f42985c6b5f6 - - - - -] Create dhcp port {u'port': {u'name': u'', u'admin_state_up': True, u'network_id': u'4bb359f3-1800-4695-b8a9-e61a381052b7', u'tenant_id': u'0080fdf7aa774a11bb260f5b8129446c', u'fixed_ips': [{u'subnet_id': u'a22cd89d-831d-4ad7-9e1d-aab89f2fb9ac'}, {u'subnet_id': u'1f6f7b07-8947-434f-855b-5f9dec4417a1'}], u'device_id': u'dhcp5dbbf756-2249-5cb9-b9b4-259ca353b226-4bb359f3-1800-4695-b8a9-e61a381052b7'}}

and everything is ok.

Need to figure out why the server can't allocate an ipv6 address for the dhcp port of net1.
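
The mismatch described above (two subnets requested, one address returned) can be spotted mechanically from the debug log. The following is a rough sketch, not Neutron code: the helper names are made up for illustration, the request payload is the "Update dhcp port" line quoted above, and the returned `fixed_ips` list is an assumption based on the statement that the port came back with only its ipv4 address.

```python
import ast


def requested_subnets(log_line):
    """Extract subnet IDs from the dict payload of a 'Create/Update dhcp port' line."""
    payload = ast.literal_eval(log_line[log_line.index("{"):])
    return [ip["subnet_id"] for ip in payload["port"]["fixed_ips"]]


def missing_subnets(requested, returned_fixed_ips):
    """Subnets that were requested but have no address on the returned port."""
    allocated = {ip["subnet_id"] for ip in returned_fixed_ips if ip.get("ip_address")}
    return [s for s in requested if s not in allocated]


line = (
    "2017-03-30 05:12:38.990 29848 DEBUG neutron.api.rpc.handlers.dhcp_rpc "
    "[req-bcd62396-0e9a-4f39-8bf7-e56f0588805c - - - - -] Update dhcp port "
    "{u'port': {u'network_id': u'29d6752b-027a-4eb9-aa73-711eff1b58ca', "
    "'binding:host_id': u'node-2.test.domain.local', u'fixed_ips': "
    "[{u'subnet_id': u'3f81f975-5718-4bdc-878c-614f22b1b783', "
    "u'ip_address': u'192.168.100.2'}, "
    "{u'subnet_id': u'8363ac60-c30d-43dc-a1d1-3d39820602fd'}]}, "
    "'id': u'2e3a5343-a995-498a-85d2-db686d119fab'}"
)

# Assumed shape of the port as returned to the agent: ipv4 address only.
returned = [
    {"subnet_id": "3f81f975-5718-4bdc-878c-614f22b1b783", "ip_address": "192.168.100.2"}
]

requested = requested_subnets(line)
missing = missing_subnets(requested, returned)
print(missing)
```

A non-empty result is the condition that makes the agent treat the port as broken and resync.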

Oleg Bondarev (obondarev) wrote :

Looks like a race between ipv6 subnet create and network dhcp port create.

Neutron server adds an ipv6 address to a dhcp port in two cases:

 1) the network already has an ipv6 subnet by the time the dhcp agent requests dhcp port creation - in this case the agent includes both subnets in the requested IPs of the port and both get allocated;
 2) the ipv6 subnet is created after the network already has a dhcp port - the ipv6 IP then gets allocated on the dhcp port as part of subnet creation on the server side;

The bug reveals the third case:
 3) ipv6 subnet and dhcp port are created at the same time: so no ipv6 IP is requested for dhcp port by dhcp agent, as well as no ipv6 address is added to dhcp port as part of subnet creation;

In this case dhcp agent tries to reprocess network after subnet/port creation and updates IPs on the dhcp port:

 2017-03-30 05:12:38.990 29848 DEBUG neutron.api.rpc.handlers.dhcp_rpc [req-bcd62396-0e9a-4f39-8bf7-e56f0588805c - - - - -] Update dhcp port {u'port': {u'network_id': u'29d6752b-027a-4eb9-aa73-711eff1b58ca', 'binding:host_id': u'node-2.test.domain.local', u'fixed_ips': [{u'subnet_id': u'3f81f975-5718-4bdc-878c-614f22b1b783', u'ip_address': u'192.168.100.2'}, {u'subnet_id': u'8363ac60-c30d-43dc-a1d1-3d39820602fd'}]}, 'id': u'2e3a5343-a995-498a-85d2-db686d119fab'}

Server ignores ipv6 auto-address subnets in this request - to be fixed. This should be handled as in case 3, see proposed patch below.
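
The three cases can be illustrated with a toy model of the ordering. This is only a sketch under simplifying assumptions and does not mirror Neutron internals: the event names, the `simulate` function, and the `honor_auto_address_update` flag are all invented here. It assumes subnet creation checks for an existing dhcp port before committing, and that the agent requests IPs only for the subnets it saw when it read the network.

```python
def simulate(events, honor_auto_address_update=False):
    """Replay an interleaving and return the subnets that end up on the dhcp port."""
    subnets = ["v4"]      # the ipv4 subnet already exists
    agent_view = None     # subnets the agent saw when it read the network
    port_ips = None       # subnets with an address on the dhcp port
    port_seen = False     # did subnet creation find an existing dhcp port?
    for event in events:
        if event == "agent_reads_network":
            agent_view = list(subnets)
        elif event == "agent_creates_port":
            port_ips = set(agent_view)
        elif event == "subnet_create_checks_ports":
            port_seen = port_ips is not None
        elif event == "subnet_create_commits":
            subnets.append("v6")
            if port_seen:
                port_ips.add("v6")
        elif event == "agent_resyncs_port":
            # The agent asks for every subnet; the server adds the
            # auto-address subnet only if it honors it on update.
            if honor_auto_address_update:
                port_ips.update(subnets)
    return sorted(port_ips)


CASE_1 = ["subnet_create_checks_ports", "subnet_create_commits",
          "agent_reads_network", "agent_creates_port"]
CASE_2 = ["agent_reads_network", "agent_creates_port",
          "subnet_create_checks_ports", "subnet_create_commits"]
CASE_3 = ["agent_reads_network", "subnet_create_checks_ports",
          "agent_creates_port", "subnet_create_commits", "agent_resyncs_port"]

print(simulate(CASE_1))
print(simulate(CASE_2))
print(simulate(CASE_3))
print(simulate(CASE_3, honor_auto_address_update=True))
```

In this model only the interleaved ordering of case 3 leaves the port without an ipv6 address, and honoring the auto-address subnet in the update request is what closes the gap.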

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (10.0/newton)

Fix proposed to branch: 10.0/newton
Change author: Oleg Bondarev <email address hidden>
Review: https://review.fuel-infra.org/32680

Oleg Bondarev (obondarev) wrote :

Possible workarounds (to be checked):
 - delete dhcp port of a network with the issue
 - add a delay between ipv4 and ipv6 subnets creation

Oleg Bondarev (obondarev) wrote :

Verified: deleting the dhcp port fixes the dhcp issue - the agent recreates the port and things get back to normal. The VMs additionally need to be rebooted so that they issue DHCP requests and get back online.
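
A sketch of how the verified workaround could be scripted. It assumes `conn` behaves like an openstacksdk `Connection` (its `network.ports(**filters)` and `network.delete_port(...)` calls); the function name is made up, and the `FakeNetwork` class is only an in-memory stand-in so the example runs without a cloud. The dhcp port and network IDs are the ones from the logs above; the `vm-port` entry is hypothetical.

```python
from types import SimpleNamespace

DHCP_DEVICE_OWNER = "network:dhcp"


def delete_dhcp_ports(conn, network_id):
    """Delete the dhcp ports of one network so the agent recreates them."""
    ports = list(conn.network.ports(network_id=network_id,
                                    device_owner=DHCP_DEVICE_OWNER))
    for port in ports:
        conn.network.delete_port(port.id, ignore_missing=True)
    return [port.id for port in ports]


class FakeNetwork:
    """In-memory stand-in for conn.network, for a dry run only."""

    def __init__(self, ports):
        self._ports = ports

    def ports(self, **filters):
        return [p for p in self._ports
                if all(getattr(p, k) == v for k, v in filters.items())]

    def delete_port(self, port_id, ignore_missing=True):
        self._ports = [p for p in self._ports if p.id != port_id]


NET1 = "29d6752b-027a-4eb9-aa73-711eff1b58ca"
fake = SimpleNamespace(network=FakeNetwork([
    SimpleNamespace(id="2e3a5343-a995-498a-85d2-db686d119fab",
                    network_id=NET1, device_owner=DHCP_DEVICE_OWNER),
    SimpleNamespace(id="vm-port", network_id=NET1, device_owner="compute:nova"),
]))

deleted = delete_dhcp_ports(fake, NET1)
remaining = [p.id for p in fake.network.ports(network_id=NET1)]
print(deleted, remaining)
```

Only the dhcp port is removed; instance ports are left alone. Rebooting the VMs afterwards is still a separate manual step.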

Alexander Ignatov (aignatov) wrote :

Decreased the priority to High since a workaround was found and verified. I suggest tracking this as a Known Issue.

Changed in fuel:
importance: Critical → High
Changed in fuel:
assignee: Ann Taraday (akamyshnikova) → Oleg Bondarev (obondarev)
tags: added: release-notes
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/neutron (10.0/newton)

Change abandoned by Oleg Bondarev <email address hidden> on branch: 10.0/newton
Review: https://review.fuel-infra.org/32680
Reason: Upstream patch to track https://review.openstack.org/#/c/452195/

Oleg Bondarev (obondarev) wrote :

Upstream Newton patch was merged: https://review.openstack.org/#/c/453511/
Need to sync or backport manually

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (10.0/newton)

Fix proposed to branch: 10.0/newton
Change author: Oleg Bondarev <email address hidden>
Review: https://review.fuel-infra.org/33148
