[Backport 1447883] Restrict netmask of CIDR to avoid DHCP resync is not enough

Bug #1450142 reported by Kevin Benton
This bug affects 1 person
Affects             Status        Importance  Assigned to        Milestone
Mirantis OpenStack  Fix Released  Critical    Alexander Ignatov
6.0.x               Fix Released  Critical    Denis Meltsaykin
6.1.x               Fix Released  Critical    Kevin Benton
7.0.x               Fix Released  Critical    Alexander Ignatov

Bug Description

There are a couple of ways that a tenant can set up their subnet to cause the DHCP agent to go into a failure loop when it tries to create a port. This puts extra load on the agent and slows its response to other changes.

This is a backport of https://bugs.launchpad.net/neutron/+bug/1447883

Description
===========
Restrict netmask of CIDR to avoid DHCP resync is not enough.
https://bugs.launchpad.net/neutron/+bug/1443798

I'd like to prevent the following cases:

[Condition]
  - Plugin: ML2
  - subnet with "enable_dhcp" set to True

[Operations]
A. Specify "[]" (empty list) for "allocation_pools" on create/update-subnet
---------------------------------------------------------------------------
$ curl -X POST -d '{"subnet": {"name": "test_subnet", "cidr": "192.168.200.0/24", "ip_version": 4, "network_id": "649c5531-338e-42b5-a2d1-4d49140deb02", "allocation_pools": []}}' -H "x-auth-token:$TOKEN" -H "content-type:application/json" http://127.0.0.1:9696/v2.0/subnets

Then the dhcp-agent tries to create its own DHCP port, which reproduces the resync bug.
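
For anyone who prefers to reproduce case A from Python, the request above can be sketched with the standard library only. The helper name `build_subnet_request` is made up for this example; the URL, network ID and body are taken from the curl command, and the token is a placeholder. The request is only built here, not sent.

```python
import json
import urllib.request


def build_subnet_request(url, token, network_id):
    """Build (but do not send) a create-subnet request with an empty pool."""
    body = {
        "subnet": {
            "name": "test_subnet",
            "cidr": "192.168.200.0/24",
            "ip_version": 4,
            "network_id": network_id,
            # The empty list is what triggers the problem described above.
            "allocation_pools": [],
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-auth-token": token, "content-type": "application/json"},
        method="POST",
    )


request = build_subnet_request(
    "http://127.0.0.1:9696/v2.0/subnets",
    "TOKEN-PLACEHOLDER",  # assumption: replace with a real Keystone token
    "649c5531-338e-42b5-a2d1-4d49140deb02",
)
```

Passing `request` to `urllib.request.urlopen` would send it to the Neutron endpoint.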

B. Create port and exhaust allocation_pools
---------------------------------------------------------------
1. Create a subnet with 192.168.1.0/24. The DHCP port has already been created.
   gateway_ip: 192.168.1.1
   DHCP-port: 192.168.1.2
   allocation_pools: {"start": "192.168.1.2", "end": "192.168.1.254"}
   The number of available ip_addresses is 252.

2. Create non-DHCP ports until the ip_addresses in allocation_pools are exhausted.
   In this case, the user creates a port 252 times.
   The number of available ip_addresses is 0.

3. The user deletes the DHCP port (192.168.1.2).
   The number of available ip_addresses is 1.

4. The user creates a non-DHCP port.
   The number of available ip_addresses is 0.
   Then the dhcp-agent tries to create a DHCP port, which reproduces the resync bug.
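
The address counts in case B can be checked with a few lines of arithmetic. This is only an illustration using Python's standard `ipaddress` module; the helper `pool_size` is invented for the example and is not part of Neutron.

```python
import ipaddress


def pool_size(start, end):
    """Return the number of addresses in an inclusive allocation pool."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    return int(last) - int(first) + 1


# Step 1: pool 192.168.1.2 through 192.168.1.254, DHCP port holds 192.168.1.2.
total = pool_size("192.168.1.2", "192.168.1.254")
free_after_dhcp_port = total - 1

# Step 2: the user creates 252 non-DHCP ports.
free_after_user_ports = free_after_dhcp_port - 252

# Step 3: the DHCP port is deleted, releasing one address.
free_after_dhcp_delete = free_after_user_ports + 1

# Step 4: one more non-DHCP port takes the released address, so the
# dhcp-agent has nothing left when it tries to recreate its port.
free_for_dhcp_agent = free_after_dhcp_delete - 1
```

With these numbers `total` is 253, `free_after_dhcp_port` is 252, and `free_for_dhcp_agent` ends at 0.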

Changed in mos:
assignee: nobody → MOS Neutron (mos-neutron)
summary: - icehouse, juno, and kilo are susceptible to exception loops in the dhcp
- agent
+ [Backport 1447883] Restrict netmask of CIDR to avoid DHCP resync is not
+ enough
Changed in mos:
milestone: none → 6.1
description: updated
ruhe (ruhe)
tags: added: neutron
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (openstack-ci/fuel-6.1/2014.2)

Fix proposed to branch: openstack-ci/fuel-6.1/2014.2
Change author: Kevin Benton <email address hidden>
Review: https://review.fuel-infra.org/6555

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (openstack-ci/fuel-6.1/2014.2)

Reviewed: https://review.fuel-infra.org/6555
Submitter: Ilya Shakhat <email address hidden>
Branch: openstack-ci/fuel-6.1/2014.2

Commit: d7a036d5f573af17a8a9988c9a302eea5de56c1b
Author: Kevin Benton <email address hidden>
Date: Tue May 12 10:34:17 2015

Don't resync on DHCP agent setup failure

There are various cases where the DHCP agent will try to
create a DHCP port for a network and there will be a failure.
This has primarily been caused by a lack of available IP addresses
in the allocation pool. Trying to fix all availability corner cases
on the server side will be very difficult due to race conditions between
multiple ports being created, the dhcp_agents_per_network parameter, etc.

This patch just stops the resync attempt on the agent side if a failure
is caused by an IP address generation problem. Future updates to the subnet
will cause another attempt so if the tenant does fix the issue they will
get DHCP service.
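
The behaviour described in this commit message can be pictured roughly as follows. This is a simplified sketch, not the actual patch: the class `DhcpAgentSketch`, the exception `IpAddressGenerationFailure` and the attribute `needs_resync_reasons` are stand-in names chosen for the example, and the real agent code is structured differently.

```python
class IpAddressGenerationFailure(Exception):
    """Stand-in for a 'no IP address available' error (hypothetical name)."""


class DhcpAgentSketch:
    """Toy model of an agent that skips resync on IP generation failures."""

    def __init__(self, setup_dhcp_port):
        # setup_dhcp_port: callable taking a network ID; may raise.
        self.setup_dhcp_port = setup_dhcp_port
        self.needs_resync_reasons = []

    def enable_dhcp(self, network_id):
        """Try to set up DHCP for a network; return True on success."""
        try:
            self.setup_dhcp_port(network_id)
            return True
        except IpAddressGenerationFailure:
            # Retrying cannot help until the tenant changes the subnet,
            # so no resync is scheduled for this failure.
            return False
        except Exception as exc:
            # Any other failure still schedules a resync.
            self.needs_resync_reasons.append((network_id, str(exc)))
            return False
```

A later subnet update would simply call `enable_dhcp` again, which is how the tenant gets DHCP service once the pool has free addresses.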

Change-Id: I0896730126d6dca13fe9284b4d812cfb081b6218
Closes-Bug: #1447883
Closes-Bug: #1450142
(backport for https://review.openstack.org/#/c/177174)
(cherry picked from commit db9ac7e0110a0c2ef1b65213317ee8b7f1053ddc)

Revision history for this message
Kristina Berezovskaia (kkuznetsova) wrote :

Verify:
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "6.1"
  openstack_version: "2014.2.2-6.1"
  api: "1.0"
  build_number: "432"
  build_id: "2015-05-18_03-43-53"
  nailgun_sha: "076566b5df37f681c3fd5b139c966d680d81e0a5"
  python-fuelclient_sha: "38765563e1a7f14f45201fd47cf507393ff5d673"
  astute_sha: "cb655a9a9ad26848bcd9d9ace91857b6f4a0ec15"
  fuel-library_sha: "1621cb350af744f497c35f2b3bb889c2041465d8"
  fuel-ostf_sha: "9ce1800749081780b8b2a4a7eab6586583ffaf33"
  fuelmain_sha: "0e970647a83d9a7d336c4cc253606d4dd0d59a60"
ubuntu+neutron+gre, 3 controller, 1 compute

Steps:
1) create network: neutron net-create test_net
2) create subnet with empty allocation pool: curl -X POST -d '{"subnet": {"name": "test_subnet", "cidr": "192.168.200.0/24", "ip_version": 4, "network_id": "e60858be-444e-4a4e-8214-33628f38abd9", "allocation_pools": []}}' -H "x-auth-token:da27c8f1b2b34c52b786f01c0a3d9332" -H "content-type:application/json" http://172.18.161.181:9696/v2.0/subnets
3) Try to boot a VM in this network: fails with a "no host" error
4) Check the dhcp-agent logs for repeated requests
There are no repeated requests now (on ISO 310 these requests can be found)

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (openstack-ci/fuel-7.0/2015.1.0)

Reviewed: https://review.fuel-infra.org/9295
Submitter: mos-infra-ci <>
Branch: openstack-ci/fuel-7.0/2015.1.0

Commit: b31ba09215e4335d0e3b2a55008203a0a5d47842
Author: Alexander Ignatov <email address hidden>
Date: Tue Jul 14 10:49:09 2015

Merge the latest state of stable/kilo

Closes-Bug: #1450142
Closes-Bug: #1457123
Closes-Bug: #1430171
Closes-Bug: #1460655
Closes-Bug: #1454421
Closes-Bug: #1442334
Closes-Bug: #1466490

Change-Id: Ie44d227cb6be9375f7ae2a157eadea6cc9976bb5

Anna Babich (ababich)
tags: added: on verification
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (openstack-ci/fuel-6.0-updates/2014.2)

Fix proposed to branch: openstack-ci/fuel-6.0-updates/2014.2
Change author: Kevin Benton <email address hidden>
Review: https://review.fuel-infra.org/10785

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (openstack-ci/fuel-6.0-updates/2014.2)

Reviewed: https://review.fuel-infra.org/10785
Submitter: mos-infra-ci <>
Branch: openstack-ci/fuel-6.0-updates/2014.2

Commit: f9a75520712a5003580eedc60672493225e0149b
Author: Kevin Benton <email address hidden>
Date: Fri Aug 28 11:47:42 2015

Don't resync on DHCP agent setup failure

There are various cases where the DHCP agent will try to
create a DHCP port for a network and there will be a failure.
This has primarily been caused by a lack of available IP addresses
in the allocation pool. Trying to fix all availability corner cases
on the server side will be very difficult due to race conditions between
multiple ports being created, the dhcp_agents_per_network parameter, etc.

This patch just stops the resync attempt on the agent side if a failure
is caused by an IP address generation problem. Future updates to the subnet
will cause another attempt so if the tenant does fix the issue they will
get DHCP service.

Change-Id: I0896730126d6dca13fe9284b4d812cfb081b6218
Closes-Bug: #1447883
Closes-Bug: #1450142
(backport for https://review.openstack.org/#/c/177174)
(cherry picked from commit db9ac7e0110a0c2ef1b65213317ee8b7f1053ddc)
(cherry picked from commit d7a036d5f573af17a8a9988c9a302eea5de56c1b)

Revision history for this message
Anna Babich (ababich) wrote :

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "187"
  build_id: "2015-08-18_03-05-20"
  nailgun_sha: "4710801a2f4a6d61d652f8f1e64215d9dde37d2e"
  python-fuelclient_sha: "4c74a60aa60c06c136d9197c7d09fa4f8c8e2863"
  fuel-agent_sha: "57145b1d8804389304cd04322ba0fb3dc9d30327"
  fuel-nailgun-agent_sha: "e01693992d7a0304d926b922b43f3b747c35964c"
  astute_sha: "e24ca066bf6160bc1e419aaa5d486cad1aaa937d"
  fuel-library_sha: "0062e69db17f8a63f85996039bdefa87aea498e1"
  fuel-ostf_sha: "17786b86b78e5b66d2b1c15500186648df10c63d"
  fuelmain_sha: "c9dad194e82a60bf33060eae635fff867116a9ce"

Verified on cluster: neutron+vxlan+l2pop, 3 controllers, 2 computes

Verification scenario
1. create network: neutron net-create test_net
2. get token: curl -i 'https://public.fuel.local:5000/v2.0/tokens' -X POST -H "Content-Type: application/json" -H "Accept: application/json" -d '{"auth": {"tenantName": "admin", "passwordCredentials": {"username": "admin", "password": "admin"}}}'
3. create subnet with empty allocation pool: curl -X POST -d '{"subnet": {"name": "test_subnet", "cidr": "192.168.200.0/24", "ip_version": 4, "network_id": "1e7fc248-f47c-4e84-b879-fca570ce836e", "allocation_pools": []}}' -H "x-auth-token:92f34b58eb304557aa0ab3b2597a7c6d" -H "content-type:application/json" https://public.fuel.local:9696/v2.0/subnets
4. Try to boot vm in this network: failed with error "No fixed IP addresses available for network: 1e7fc248-f47c-4e84-b879-fca570ce836e, not rescheduling"
5. Check the dhcp-agent logs for repeated requests: there is only one network_create_end call and one subnet_update_end call for this scenario - http://paste.openstack.org/show/437302/

tags: removed: on verification
Revision history for this message
Vadim Rovachev (vrovachev) wrote :

verified on 6.0.

tags: added: 6.0 release-notes-done