Heat does not delete ports from failed instances

Bug #1766763 reported by Julia Kreger
This bug affects 2 people
Affects: OpenStack Heat
Status: In Progress
Importance: Undecided
Assigned to: Julia Kreger
Milestone: no-priority-tag-bugs

Bug Description

With TripleO heat templates, an internal port is utilized with baremetal nodes. If a deployment of a baremetal node fails, that port appears not to be deleted. This matters because, with baremetal nodes, the MAC address of the port has been set externally to match the actual physical baremetal hardware.

Since instances are created with a pre-existing port in those scenarios, Nova will not remove the port when the instance is torn down due to a failure, because the port was created outside of Nova's context. This is exposed as the preserve_on_delete attribute in the instance network metadata.

This matters because the port not being deleted prevents the physical baremetal node from being re-used.

Nova code for port "addition" and "deletion" with pertinent areas:

creation:
https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L1090
deletion:
https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L1321
https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L1282
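
For illustration, the deallocation behavior described above looks roughly like this (a minimal sketch, not Nova's actual code; PortInfo and teardown_ports are hypothetical names):

  # Minimal sketch of the Nova behavior described above -- not the real code.
  # PortInfo and teardown_ports are hypothetical names for illustration only.

  from dataclasses import dataclass

  @dataclass
  class PortInfo:
      port_id: str
      preserve_on_delete: bool  # True when the port was created outside Nova

  def teardown_ports(neutron_client, ports):
      """On instance teardown, delete only the ports Nova itself created."""
      for port in ports:
          if port.preserve_on_delete:
              # The port was passed in at boot time (e.g. created by Heat for
              # a baremetal node), so Nova leaves it behind -- this is the
              # orphaned port that keeps the MAC address in use.
              continue
          neutron_client.delete_port(port.port_id)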

This manifests as a deployment of new instances failing, with ironic ultimately reporting an error along these lines:

2018-04-24 11:24:51.401 21530 ERROR ironic.common.neutron MacAddressInUseClient: Unable to complete operation for network 44bd180a-5344-47ad-8459-a47b5ad382d9. The mac address 52:54:00:1c:50:81 is in use.

Ironic can't route around this issue because it has no means to identify whether the other port can be deleted. Given the Nova behavior, Heat realistically needs to clean up the port when trying to replace the instance.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/564104

Changed in heat:
assignee: nobody → Julia Kreger (juliaashleykreger)
status: New → In Progress
Revision history for this message
Rabi Mishra (rabi) wrote :

1. When replacing instances (we do that when updating FAILED server resources), we call detach_interface[1] when preparing for replacement, which seems to delete the internal port unless preserve_on_delete is True[2].

2. When deleting instances we explicitly delete[3] the ports from neutron, which removes them irrespective of the flag.

However, we do delete the old instance after the replacement is successful, which would try to delete the old port too.

To me it looks like the implementation in heat assumed that detach_interface would not delete the port, as we would need to attach that back to the old instance when doing a rollback.

We can delete the ports explicitly after calling detach_interface, but that would break the rollback use case, which is anyway broken atm :D

[1] https://github.com/openstack/heat/blob/master/heat/engine/resources/openstack/nova/server_network_mixin.py#L547

[2] https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L2317

[3] https://github.com/openstack/heat/blob/master/heat/engine/resources/openstack/nova/server_network_mixin.py#L157
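
For reference, a condensed sketch of those two code paths (method bodies simplified and paraphrased, not the exact Heat implementation in server_network_mixin.py):

  # Condensed sketch of points 1 and 2 above; simplified, not the exact
  # Heat code.

  class ServerNetworkMixin(object):

      def prepare_for_replace(self, nova_client, server_id, port_ids):
          # Point 1: detach the interfaces so they could, in theory, be
          # re-attached to the old server on rollback.  Nova deletes the
          # detached port unless preserve_on_delete is True.
          for port_id in port_ids:
              nova_client.servers.interface_detach(server_id, port_id)

      def delete_internal_ports(self, neutron_client, internal_port_ids):
          # Point 2: on resource delete, Heat removes its internal ports
          # directly from Neutron, irrespective of preserve_on_delete.
          for port_id in internal_port_ids:
              neutron_client.delete_port(port_id)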

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

In testing the patch I proposed, it seems to help, but we still seem to orphan some ports... although this does appear to be a case of resource exhaustion, where a failed deploy can't be replaced because the number of instances requested exactly matches the number of available baremetal nodes. On the plus side, it is failing at that very last node now, which is a positive sign for the patch in my mind. (tl;dr I'm testing in an environment with BMC issues which can cause deployments to baremetal nodes to fail quite reliably) :|

Is the behavior to wait on deleting the failed instance tunable at all? That would seemingly address the issue we're encountering.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Thinking about it a little further, it might make sense to look up and delete the internal ports if the server is in ERROR state, since replace doesn't imply rebuild as far as I'm aware. I think ironic is the only case where rebuild deploys to the same node.

Revision history for this message
Rabi Mishra (rabi) wrote :

> In testing the patch I proposed, it seems to help but we still seem to orphan some ports

We create an 'internal port' for the new instances only if a 'subnet' is specified[1], before making calls to nova. So if nova subsequently fails to schedule an instance, the port is left behind. However, it would be deleted when heat deletes that instance, or replaces the instance once your patch lands. So there should only be as many orphaned ports as there are FAILED instances, which should be fine IMO.

The only drawback/inefficiency with your patch is that we would create a new internal port every time we replace an instance, even when nothing has changed in the server's networks property, and the new server may get a different IP (unless fixed_ip is specified).

Ideally we should have been able to attach the existing port to the new instance, though I think that would be complicated, as we would also have to check whether the subnet has changed and a new internal port is needed instead.

[1] https://github.com/openstack/heat/blob/master/heat/engine/resources/openstack/nova/server_network_mixin.py#L234
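
As a rough sketch of the decision at [1] (simplified and paraphrased; build_nics and the dict keys are illustrative, not Heat's exact code):

  # Rough sketch of the decision at [1]: an internal port is only created
  # when a subnet is given, and it is created before the Nova boot call,
  # so a scheduling failure leaves it behind.

  def build_nics(neutron_client, networks):
      nics = []
      for net in networks:
          if net.get('port'):
              # User supplied an explicit port: nothing to create.
              nics.append({'port-id': net['port']})
          elif net.get('subnet'):
              # Heat creates the internal port itself ahead of the boot call.
              port = neutron_client.create_port(
                  {'port': {'network_id': net['network'],
                            'fixed_ips': [{'subnet_id': net['subnet']}]}})
              nics.append({'port-id': port['port']['id']})
          else:
              nics.append({'net-id': net['network']})
      return nics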

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

The problem is not a failure to schedule, but a failure of the deployment itself, leaving the instance in ERROR state shortly after the node spawn action occurs in nova. We would never be able to re-use the port in the baremetal context because it has changed: the binding profile has been set/asserted and changed... so realistically we might not be able to re-use the port or the IP allocation, which I believe it does already.

Thinking about it a little further, with any complex network involving an ML2 driver the port would have to be completely detached, unbound, and removed, and I think the only way to do that is through deletion. ML2 drivers may have wired networks through switch fabrics to support an instance on a compute node, and just moving that port may orphan that wiring in the switch fabric.

That same issue actually exists for baremetal as well, with an ML2 driver mapping networks through to a compute host.

The only way to avoid that would be to detach, unbind, and then sanity check the remaining metadata about the port (MAC address included, because a conflict is a hard failure for neutron) to ensure it matches the initial state, and if something does not match the initial state, delete it. Then again, deleting it might just be easier... :|
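
Purely as an illustration of that detach/unbind/compare idea (the attribute list and the saved initial_state are assumptions, not existing Heat or Ironic code):

  # Illustration of the unbind-then-sanity-check idea; the attribute list
  # and the saved initial_state (the port's attributes as first created,
  # before any binding) are assumptions, not existing code.

  CHECKED_ATTRS = ('mac_address', 'binding:profile', 'binding:host_id')

  def reset_or_delete_port(neutron_client, port_id, initial_state):
      # Unbind first, clearing the host and the binding profile.
      neutron_client.update_port(
          port_id, {'port': {'binding:host_id': '', 'binding:profile': {}}})
      current = neutron_client.show_port(port_id)['port']
      # If anything no longer matches the original state (a MAC conflict is
      # a hard failure for Neutron), deleting the port is the safe option.
      if any(current.get(a) != initial_state.get(a) for a in CHECKED_ATTRS):
          neutron_client.delete_port(port_id)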

Anyway, I'm happy to add a check so that the ports are only deleted on replacement when the prior instance is in ERROR state, which should still enable rollback in other cases, that is, once rollback works again. (Granted, an instance in ERROR state may be recoverable through rebuild or some other action, but I'm not sure that is entirely in scope for an orchestration tool.)
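
A minimal sketch of that check, assuming a replacement hook that can read the old server's status (function name is hypothetical):

  # Minimal sketch of the proposed behavior: during replacement, delete the
  # internal ports only when the old server is in ERROR state, so rollback
  # of healthy servers keeps working.

  def prepare_ports_for_replace(nova_client, neutron_client,
                                server_id, internal_port_ids):
      server = nova_client.servers.get(server_id)
      if server.status == 'ERROR':
          # Failed deploy: the ports cannot be cleanly re-used (bindings and
          # externally-set MACs have changed), so remove them outright.
          for port_id in internal_port_ids:
              neutron_client.delete_port(port_id)
      else:
          # Healthy server being replaced: just detach, keeping the ports
          # available for a rollback re-attach.
          for port_id in internal_port_ids:
              nova_client.servers.interface_detach(server_id, port_id)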

Revision history for this message
Harald Jensås (harald-jensas) wrote :

Zane made a point on the patch review: deleting implicitly created ports is not the full solution to this problem. The issue also exists when using non-implicit (pre-created) ports.

Example:

  openstack port create --network private test-port1
  openstack port create --network private test-port2

  openstack server create \
    --image cirros-0.3.5-x86_64-disk \
    --port test-port1 \
    --flavor baremetal \
    test-server

  openstack server delete test-server

  openstack server create \
    --image cirros-0.3.5-x86_64-disk \
    --port test-port2 \
    --flavor baremetal \
    test-server

The second server create command fails with:
fault:
  code: 500
  created: '2018-04-29T12:21:37Z'
  < .. snip .. >
  message: 'Build of instance 5982f29a-3d48-4eb8-88ca-cf794c8f31dd aborted: Failure
    prepping block device.'

And in the conductor log:
Apr 29 14:21:35 ironic-devstack.lab.example.com ironic-conductor[445]: ERROR ironic.common.neutron MacAddressInUseClient: Unable to complete operation for network 52497681-18f4-47de-b48e-1265799caea9. The mac address 52:54:00:53:b5:b9 is in use.

I think the proper solution would be for unbind_neutron_port()[1] in ironic to unset the mac_address of the port. For this to work, neutron would have to accept a port update with mac_address = None and trigger the code that generates a MAC address[2] again.

[1] https://github.com/openstack/ironic/blob/master/ironic/common/neutron.py#L91
[2] https://github.com/openstack/neutron/blob/master/neutron/db/db_base_plugin_v2.py#L1244-L1251
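
A hedged sketch of that proposal for [1], paraphrased rather than ironic's exact code (it assumes Neutron accepted mac_address = None in a port update, which it does not today; that acceptance is the change being suggested):

  # Sketch of the proposal for [1]: in addition to clearing the binding,
  # unset mac_address so Neutron would regenerate one and the physical MAC
  # is freed.  Assumes Neutron accepted 'mac_address': None, which it does
  # not today -- that acceptance is the change being suggested here.

  def unbind_neutron_port(client, port_id):
      body = {'port': {'binding:host_id': '',
                       'binding:profile': {},
                       'mac_address': None}}  # proposed addition
      client.update_port(port_id, body)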

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Harald, you're correct in that regard: if someone defines a port far upfront and submits it for use, then... there is not much we can do short of possibly considering resetting the prior MAC address back in ironic.

Alternatively, it also feels like heat should perhaps consider replacing all ports associated with instances that fail to be created when attempting a retry, since they shouldn't be re-used, imho. The conundrum is that these sorts of changes break rollback, unless we only focus on instances in ERROR state.

Rico Lin (rico-lin)
Changed in heat:
milestone: none → no-priority-tag-bugs