Stack creation is failed due to Resource CREATE failed: WaitConditionTimeout: 0 of 5 received

Bug #1636428 reported by TatyanaGladysheva
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Mirantis OpenStack
Invalid
High
Sergii Rizvan
7.0.x
Won't Fix
High
MOS Maintenance
8.0.x
Won't Fix
High
MOS Maintenance
9.x
Invalid
High
Sergii Rizvan

Bug Description

Steps to reproduce:
1. Create stack using the following template: https://paste.mirantis.net/show/2712/
2. Wait for stack creation

Expected results:
Stack is created successfully.

Actual results:
Instance is created successfully. Stack is in status CREATE_FAILED due to Resource CREATE failed: WaitConditionTimeout: resources.wait_condition: 0 of 5 received

From heat-engine.log:
2016-10-25 06:26:49.827 2945 INFO heat.engine.resources.openstack.heat.wait_condition [-] HeatWaitCondition "wait_condition" Stack "TEST_STACK" [a8d49f86-131c-4921-b71c-1d30056af0be] Timed out (0 of 5 received)
2016-10-25 06:26:49.827 2945 INFO heat.engine.resource [-] CREATE: HeatWaitCondition "wait_condition" Stack "TEST_STACK" [a8d49f86-131c-4921-b71c-1d30056af0be]
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource Traceback (most recent call last):
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resource.py", line 504, in _action_recorder
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource yield
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resource.py", line 574, in _do_action
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource yield self.action_handler_task(action, args=handler_args)
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/scheduler.py", line 327, in wrapper
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource step = next(subtask)
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resource.py", line 548, in action_handler_task
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource while not check(handler_data):
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resources/openstack/heat/wait_condition.py", line 117, in check_create_complete
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource return self._wait(*data)
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resources/openstack/heat/wait_condition.py", line 96, in _wait
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource raise exc
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource WaitConditionTimeout: 0 of 5 received
2016-10-25 06:26:49.827 2945 TRACE heat.engine.resource
2016-10-25 06:26:50.892 2945 INFO heat.engine.stack [-] Stack CREATE FAILED (TEST_STACK): Resource CREATE failed: WaitConditionTimeout: resources.wait_condition: 0 of 5 received
2016-10-25 06:26:50.908 2945 INFO heat.engine.service [-] Stack create failed, status FAILED

Reproducibility:
Always for cluster with enabled TLS
Intermittently for clusters without TLS

Workaround:
Not Found

Impact:
Verification of bug https://bugs.launchpad.net/mos/+bug/1497273 is blocked by this bug.

Description of the environment:
FUEL/MOS 7.0 build 301, 8.0 build 570, TLS is enabled.

Additional information:
Issue is reproducible from time to time on 7.0, 8.0 clusters with disabled TLS.

Tags: area-heat
description: updated
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Bug https://bugs.launchpad.net/mos/+bug/1497273 was verified on MOS 7.0 + MU6 updates on cluster without TLS (despite of fact that issue from description was observed several times). After this bug will be fixed, bug 1497273 also should be checked on cluster with TLS.

tags: added: area-heat
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/heat (openstack-ci/fuel-8.0/liberty)

Fix proposed to branch: openstack-ci/fuel-8.0/liberty
Change author: danny <email address hidden>
Review: https://review.fuel-infra.org/28088

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/heat (openstack-ci/fuel-7.0/2015.1.0)

Fix proposed to branch: openstack-ci/fuel-7.0/2015.1.0
Change author: danny <email address hidden>
Review: https://review.fuel-infra.org/28089

Revision history for this message
Sergii Rizvan (srizvan) wrote :

Here is an upstream bug: https://bugs.launchpad.net/heat/+bug/1594441
As we can see it was fixed in Mitaka, so 9.x branch already contains the patch. That why status for 9.2 has been set as Invalid.

Revision history for this message
Sergii Rizvan (srizvan) wrote :

Note for QA: it's necessary to add next configuration to /etc/heat/heat.conf in order to changes make effect and restart heat-engine:

[clients_heat]
url=https://<Public Endpoing IP>:8004/v1/%(tenant_id)s
insecure = True

Revision history for this message
Sergey Kraynev (skraynev) wrote :

Sergii, I want to note, that using this option in config in production is not secure obviously, because it disables SSL everywhere in Heat, not only for Wait conditions.

So you may use it for testing, but not for production state.
Also I several times repeated, that it's expected behavior.

Wait condition should not work with SSL by default without any changes, because it requires SSl certificates on the booted VM. So there are several options how to deal with it in production:
 - enable TLS:
    a. Put ssl certificates to VM - it should be done by user and it's not so obvious, because these SSL certificates should be the same as on Contollers/computes of Openstack deployment
    b. manually set "--insecure" flag inside of Heat template (if we take a look on the https://paste.mirantis.net/show/2712/ this flag should be added in each line from 36 to 40).
    c. Use insecure option in config, as suggested in previous comment from Sergii. However he you need to take in mind, that it's real insecure - it disables SSl for whole Heat, not only waitconditions.
 - disable TLS:
     use mentioned template and Heat without changes :)

P.s. I will approve backports for proposed patch, it will allow to support case "c" in production.

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/heat (openstack-ci/fuel-8.0/liberty)

Reviewed: https://review.fuel-infra.org/28088
Submitter: Pkgs Jenkins <email address hidden>
Branch: openstack-ci/fuel-8.0/liberty

Commit: e0a538e51f36847523d5aff6bd8ffb224643b8ce
Author: danny <email address hidden>
Date: Tue Nov 1 18:24:20 2016

Add --insecure in CURL if set True in client option

The CURL attribute is offen used as part of
the user data to signal back to heat in wait
handle. If the insecure option is set True in
the clients_heat option, we need to add
--insecure to the CURL url to make it work.

Change-Id: I153a9c71837ee61632e0cf39254bbbc66427b1de
Closes-Bug: #1636428

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/heat (openstack-ci/fuel-7.0/2015.1.0)

Reviewed: https://review.fuel-infra.org/28089
Submitter: Denis V. Meltsaykin <email address hidden>
Branch: openstack-ci/fuel-7.0/2015.1.0

Commit: 81838f83ccf38c2097b6399084a30bb3fdf87415
Author: danny <email address hidden>
Date: Tue Nov 1 18:34:00 2016

Add --insecure in CURL if set True in client option

The CURL attribute is offen used as part of
the user data to signal back to heat in wait
handle. If the insecure option is set True in
the clients_heat option, we need to add
--insecure to the CURL url to make it work.

Change-Id: I153a9c71837ee61632e0cf39254bbbc66427b1de
Closes-Bug: #1636428

tags: added: on-verification
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Verified on MOS 7.0 + MU7 updates.

Actual results:
After the fix it's possible to create stack for cluster with Enabled TLS if to add next configuration to /etc/heat/heat.conf and restart heat-engine:
   [clients_heat]
   url=https://<Public Endpoing IP>:8004/v1/%(tenant_id)s
   insecure = True

tags: removed: on-verification
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Verified on MOS 8.0 + MU4 updates.

Actual results:
After the fix it's possible to create stack for cluster with Enabled TLS if to add next configuration to /etc/heat/heat.conf and restart heat-engine:
   [clients_heat]
   url=https://<Public Endpoing IP>:8004/v1/%(tenant_id)s
   insecure = True

tags: added: on-verification
tags: removed: on-verification
Revision history for this message
Sriram Kannan (shri-ramk) wrote :

“Heat scripts are failing in WaitCondition“

“Create_Failed: Resource CREATE failed: WaitConditionTimeout: resources.serverpair.resources[7].resources.serverpair.resources[1].resources.waitcondition: 0 of 1 received”

1. Fuel 7.0 MU 7 is used for deployment in openstack kilo with http endpoints
2. 3 controller cluster
3. All nodes have heat_metadata_server_url and heat_waitcondition_server_url pointed to public routable IP
4. Instances getting created successfully,stack is failing in wait condition.
5. 401 message seen in heat engine logs

Instance console log: -

+ echo 'Hello, World!'
+ cfn-create-aws-symlinks
+ cfn-signal -e 0 -r 'FINISH!!' 'http://xxxxxxxx:8000/v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3A876e627764f742238cff5ab5e7b814f8%3Astacks%2Fsfsdfsf%2Fe0e28440-2670-41d7-8209-ad9117f22149%2Fresources%2Fwait_handle?Timestamp=2017-04-07T04%3A30%3A51Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=5d1f9faab31d4947a7f55ffe973ebdb7&SignatureVersion=2&Signature=peXuxltNbUb0Yv5sH86TifKRJq4NMkiBBQBoQ0ZGfQE%3D'
DEBUG [2017-04-07 04:31:53,734] cfn-signal called Namespace(data='Application has completed configuration.', exit=None, exit_code='0', reason='FINISH!!', success='true', unique_id='00000', url='http://10.142.194.148:8000/v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3A876e627764f742238cff5ab5e7b814f8%3Astacks%2Fsfsdfsf%2Fe0e28440-2670-41d7-8209-ad9117f22149%2Fresources%2Fwait_handle?Timestamp=2017-04-07T04%3A30%3A51Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=5d1f9faab31d4947a7f55ffe973ebdb7&SignatureVersion=2&Signature=peXuxltNbUb0Yv5sH86TifKRJq4NMkiBBQBoQ0ZGfQE%3D')
DEBUG [2017-04-07 04:31:53,734] Running command: curl -X PUT -H 'Content-Type:' --data-binary '{"Status": "SUCCESS", "Reason": "FINISH!!", "Data": "Application has completed configuration.", "UniqueId": "00000"}' "http://xxxxxxxx:8000/v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3A876e627764f742238cff5ab5e7b814f8%3Astacks%2Fsfsdfsf%2Fe0e28440-2670-41d7-8209-ad9117f22149%2Fresources%2Fwait_handle?Timestamp=2017-04-07T04%3A30%3A51Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=5d1f9faab31d4947a7f55ffe973ebdb7&SignatureVersion=2&Signature=peXuxltNbUb0Yv5sH86TifKRJq4NMkiBBQBoQ0ZGfQE%3D"

Heat engine log : -

heat-api-cfn.log:2017-04-07 04:31:54.720 7167 INFO eventlet.wsgi.server [-] 192.168.12.19 - - [07/Apr/2017 04:31:54] "PUT /v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3A876e627764f742238cff5ab5e7b814f8%3Astacks%2Fsfsdfsf%2Fe0e28440-2670-41d7-8209-ad9117f22149%2Fresources%2Fwait_handle?Timestamp=2017-04-07T04%3A30%3A51Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=5d1f9faab31d4947a7f55ffe973ebdb7&SignatureVersion=2&Signature=peXuxltNbUb0Yv5sH86TifKRJq4NMkiBBQBoQ0ZGfQE%3D HTTP/1.1" 401 219 0.232454

Revision history for this message
Peter Razumovsky (prazumovsky) wrote :

Following bug investigation get next results:

I added to raw user_data several test lines: create new "check" file with test string, after that touch new log-file and write curl verbose output to it:

...
      user_data_format: RAW
      user_data:
        str_replace:
          template: |
            #!/bin/sh
            echo "Test!" > /check
            echo "Starting user_data script" > /wc_logs.log
            wc_notify --data-binary '{"status": "SUCCESS"}' --verbose >> /wc_logs.log 2>&1
            wc_notify --data-binary '{"status": "SUCCESS", "reason": "signal2"}' --verbose >> /wc_logs.log 2>&1
            wc_notify --data-binary '{"status": "SUCCESS", "reason": "signal3", "data": "data3"}' --verbose >> /wc_logs.log 2>&1
            wc_notify --data-binary '{"status": "SUCCESS", "reason": "signal4", "data": "data4"}' --verbose >> /wc_logs.log 2>&1
            wc_notify --data-binary '{"status": "SUCCESS", "id": "5"}' --verbose >> /wc_logs.log 2>&1
            wc_notify --data-binary '{"status": "SUCCESS", "id": "5"}' --verbose >> /wc_logs.log 2>&1
            echo "user_data script finished" >> /wc_logs.log
          params:
            wc_notify: { get_attr: ['wait_handle', 'curl_cli'] }
...

Get 7/10 CREATE_COMPLETE stacks. 3 failed stacks interrupted with waitcondition timeout => 0 of 5 signals received. When I ssh to nova instance of failed stack, there are neither "check" file nor "wc_logs.log" file with logs. After some debugging I found that heat correctly send user_data value to nova, but nova do nothing with it (or instance hanging and refuses run user_data script).

Based on this, I can concluse that basic incorrect behaviour is into nova, not heat.

Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

Nova team, please take a look.

Also I'm going to move this bug to 8.0-MU-5 since it will take much more time to investigate and fix.

Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

Moving this bug to backlog because it needs a thorough investigation which doesn't seem like it is worth it right now.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

We no longer support MOS5.1, MOS6.0, MOS6.1
We deliver only Critical/Security fixes to MOS7.0, MOS8.0.
We deliver only High/Critical/Security fixes to MOS9.2.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.